INTERACTION-ACTIVATED PROTEINS

CROSS REFERENCE TO RELATED APPLICATIONS 

This application claims the benefit of US Provisional Application No. 60/124,339,
filed March 15, 1999, US Provisional Application No. 60/135,926, filed May 25,
1999, and US Provisional Application No. 60/175,968, filed January 13, 2000, which
disclosures are hereby incorporated by reference.

GOVERNMENT LICENSE RIGHTS 

The U.S. government has a paid-up license in this invention and the right in limited 
circumstances to require the patent owner to license others on reasonable terms as provided 
for by the terms of grant No. ______ awarded by ______.

INTRODUCTION

Technical Field 

The present invention is concerned with detecting interactions between proteins by
expressing them as part of a fusion sequence that also encodes one fragment of a
fragment pair that reassembles into a directly detectable protein. The interaction-dependent
enzyme association (IdEA) systems of the present invention are exemplified by the bacterial
β-lactamases, a large group of structurally-related enzymes which segregate into several
groups on the basis of structural homologies and substrate specificities.

Background 

Most physiological processes depend on complex networks of cells interacting with 
one another and their environments, primarily through specific recognition between 
proteins - from the ligand-mediated assembly of multi-protein complexes at the cell 
surface, through the labyrinth of intracellular signal transduction cascades, to the assembly 
of transcription-modulating complexes on the promoters of specific genes. Thus, for most 
pathological conditions, protein-protein interactions are instrumental and provide a wealth 
of targets for diagnostic and therapeutic intervention. As a result, new and improved
methods are in constant demand for (1) identifying natural ligands of key participants to
study their roles in disease, and (2) developing surrogate ligands for therapeutic
intervention and diagnosis. A number of methods have been developed over the years to
address each of these goals. The most widely used current methods for identifying natural
proteins which interact with a protein-of-interest generally involve screening libraries of
expressed cDNAs. A few genes for ligands of proteins-of-interest have been isolated by
direct screening of cDNA expression libraries on filters for binding to labeled versions of
the protein-of-interest, as in antibody probing (Blackwood and Eisenman, Science (1991)
251:1211; Defeo-Jones et al., Nature (1991) 352:251). However, a great many important
protein interactions are not robust enough for the harshness of such methods, where
conditions of interaction are usually far from native. Also, the false positive frequency of
these methods is high, due to the presence of denatured protein in cells which have been
fixed to make the target proteins accessible to probes.

A major advance in cDNA screening methodology came with the development of
systems in which screenable or selectable cellular phenotypes could be engineered to
depend on desired protein interactions within living cells (Fields and Song, Nature (1989)
340:245; Chien et al., Proc Natl Acad Sci (1991) 88:9578; Zervos et al., Cell (1993)
72:223; Vojtek et al., Cell (1993) 74:205; and Luban et al., Cell (1993) 73:1067). The
most widely used of these is the yeast "two hybrid" system of Fields and Song (1989,
supra). This system takes advantage of the "modularity" of many functional domains in
proteins which allows the linking of functions to be manipulated. This is particularly true
for transcriptional activators, in which an activation domain which interacts with the core
transcription complex is "homed" to specific genes by a sequence-specific DNA-binding
domain. For many transcriptional activators these domains can function independently,
and in fact are often in separate, interacting subunits. In the yeast two-hybrid system, the
"bait" protein is expressed as a fusion with a cis-element sequence-specific DNA-binding
domain, and cDNAs are expressed as fusions with a transactivation domain. When, and
only when, these two domains are brought together by interaction of a cDNA product with
the "bait" protein, can the reporter gene be expressed, since its transcription is dependent
on transactivation from the cis-element. Reporters can be either screenable (e.g.,
β-galactosidase for color) or selectable (e.g., HIS3 for growth in the absence of histidine).

Variations of this system have been successfully employed to identify a number of
important protein-protein interactions (Chien et al., 1991, supra; Zervos et al., 1993,
supra; Vojtek et al., 1993, supra; and Luban et al., 1993, supra; Bartel et al., Nature
Genetics (1996) 12:72; Fromont-Racine et al., Nature Genetics (1997) 16:277; Xu et al.,
Proc Natl Acad Sci (1997) 94:12473). In spite of its success, however, the original yeast
two-hybrid system has serious drawbacks for the high-throughput applications required to
accelerate pharmaceutical target discovery from genomics. The fundamental limitation with
this system is that many steps are required between the test interaction and the generation
of a selectable phenotype. Each such step presents an opportunity for non-specific
interaction to raise the false positive background, and for dissociation to allow bona fide
interactors to be missed. The false positive problem is exacerbated by the highly
combinatorial nature of the transcription machinery and the abundance of protein domains
encoded in cDNA libraries which can interact with one or more components of the
transcription initiation complex, including transactivator-bound promoter DNA (Bartel
et al., BioTechniques (1993) 14:920). Another limitation of the original two-hybrid system
is that it generally cannot accommodate secreted or membrane proteins, and cytoplasmic
proteins must be stable in the yeast nucleus.

Recently the two-hybrid concept has been expanded to include other types of protein
functionalities for use as protein-protein interaction reporting systems. For example, in the
Selective Infective Phage (SIP) system a protein which confers infectivity on filamentous
bacteriophage has been fragmented in such a way that it is functional only when the
fragments are fused to heterologous interactors (Krebber et al., J Mol Biol (1997)
268:607). The interaction is then monitored by its ability to allow phage encoding the
interactors to transfer a selectable phenotype to susceptible cells by infection. However,
this method also suffers from requiring many low-efficiency steps between the target
interaction and the expression of the selectable phenotype by the recipient cell. Also like
the two-hybrid system, the efficiency of this system suffers from the fact that most natural
protein-protein interactions have affinities in the micromolar range, with half-lives on the
order of seconds. When the time delay between interaction and signal generation exceeds
this half-life, which it does in these systems, the efficiency of interaction detection declines
sharply.
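
The kinetic argument above can be illustrated with a rough calculation. The following is only a sketch under stated assumptions: it assumes simple 1:1 binding, so that the dissociation rate constant is k_off = Kd x k_on and the half-life of the complex is ln 2 / k_off. The association rate constant of 1e6 per molar per second is an illustrative assumption and is not a value taken from this disclosure; the function name is likewise hypothetical.

```python
import math


def complex_half_life_s(kd_molar: float, k_on_per_molar_s: float = 1e6) -> float:
    """Half-life in seconds of a 1:1 complex, assuming k_off = Kd * k_on.

    The default k_on is an assumed, illustrative value.
    """
    if kd_molar <= 0 or k_on_per_molar_s <= 0:
        raise ValueError("Kd and k_on must be positive")
    k_off = kd_molar * k_on_per_molar_s  # dissociation rate constant, per second
    return math.log(2) / k_off


# A micromolar interaction with the assumed k_on dissociates at about 1 per second,
# so the complex half-life is on the order of a second.
micromolar_half_life = complex_half_life_s(1e-6)
```

Under these assumptions a micromolar interaction has a half-life well under ten seconds, which is consistent with the statement that a reporting delay longer than the half-life sharply reduces detection efficiency.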

More recently still, the two-hybrid concept has been adapted to proteins which can
confer selectable phenotypes directly from protein-protein interactions, with few or no
intervening steps between the target interaction and signal generation. For example,
interactors can be fused to variants of the Green Fluorescent Protein of Aequorea victoria
(GFP), which are capable of detectable fluorescence resonance energy transfer (FRET)
when brought into close proximity by the interactors (Cubitt et al., Trends Biochem (1995)
20:448). Some enzymes which confer selectable or screenable phenotypes on cells can also
be adapted for two-hybrid type protein-protein interaction detection (Rossi et al., Proc Natl
Acad Sci (1997) 94:8405; Pelletier et al., Proc Natl Acad Sci (1998) 95:12141). In this
variation, protein interactors are fused to enzyme fragments, which by themselves are
inactive. However, when the enzyme fragments are brought together by the interaction of
the protein domains to which they are fused, the fragments are able to associate to
reconstitute the selectable activity of the enzyme. This is an example of interaction-dependent
enzyme activation (IdEA), and it is illustrated in Figure 1. Both IdEA and GFP
FRET systems present advantages over previous versions of the two-hybrid concept. For
instance, the selectable signal is produced directly from the desired interaction, without any
intervening steps which are the main sources of inefficiency in the earlier systems. Such
improvements in efficiency and background should make these methods more amenable to
high-throughput applications. However, although both IdEA and GFP FRET systems in
theory can be set up in both prokaryotic and eukaryotic cells, and either in the cytoplasm
or in a secretory pathway to allow interactions to be monitored in natural milieus, they
have not. All IdEA systems reported to date have only utilized cytoplasmic enzymes and
have only been shown to be operative in that compartment (Rossi et al., 1997, supra; Pelletier
et al., 1998, supra; Karimova et al., Proc Natl Acad Sci (1998) 95:5752). Indeed, because of
their design, these reported systems would not be expected to function in the secretory
pathway or in the bacterial periplasm. Thus, they are not considered useful for monitoring
the interactions of secreted proteins.

The most widely used current systems for the detection of extra-cellular protein-protein
interactions, namely viral or cellular display systems, are essentially in vitro methods
with high stringencies of selection and/or high backgrounds. Thus, they are not well suited for
high-throughput applications. These systems also usually require the use of a purified known
heterologous interactor domain or "bait protein", and are therefore not suitable for multiplex
applications where neither heterologous interactor domain of a protein binding pair is known
a priori, i.e., the combinatorial interaction of two protein libraries with one another for
simultaneous identification of all protein binding pair interactions. One system which does
not require bait purification for identification of extra-cellular interactions is the E. coli Dimer
Detection System (EDDS; Small Molecule Therapeutics, Inc., Monmouth Junction, NJ). Bait
proteins for this system are restricted to type I membrane receptors which have single
transmembrane domains and require simple dimerization for signaling. The ecto-domain of
the bait receptor is fused to the transmembrane domain and endo-domain of an E. coli
receptor. When this fusion protein is co-expressed with an expression library in the bacterial
periplasm, ligands for the receptor can be identified by their ability to dimerize the receptor
and induce expression of a selectable phenotype. However, this system suffers from the same
limitation as the yeast two-hybrid and SIP systems, namely, that multiple steps between
interaction and phenotype cause severe loss of efficiency due to high false positive and false
negative rates.

It is therefore of interest to develop IdEA systems capable of simultaneous detection
of multiple interactions between extra-cellular as well as intracellular proteins in a high
throughput format.

Relevant Literature 

USPN 5,585,245 discloses a ubiquitin-based protein sensor complementation system 
where binding of two predetermined proteins of a binding pair is detected as specific 
proteolysis of ubiquitin by ubiquitinases. PCT publication WO 98/44350 discloses a 
reporter subunit complementation system employing fusion proteins of β-galactosidase
subunits. PCT publication WO 98/34120 discloses a protein fragment complementation 
system employing dihydrofolate reductase. 

SUMMARY 

Compositions and methods are provided for identifying interactions between
polypeptides using an interaction-dependent protein association system. The system is
characterized by using fragment pairs comprised of a first and a second member that
functionally reassemble into a marker protein having a directly detectable signal, such as a
visible phenotypic change or antibiotic resistance. The fragment complementation system
of the present invention involves co-expression in a host cell of a first and a second
oligopeptide, where each is a fusion protein separated by a flexible polypeptide linker with
a member of the marker protein fragment pair. Binding of the first oligopeptide to the
second oligopeptide results in the functional reconstitution of the fragment pair into a
marker protein, and the interacting first and second oligopeptides are identified by isolating
and sequencing plasmids from a host cell that displays a directly detectable signal indicative
of the marker protein. Functional reconstitution of the fragment pairs into a marker protein
can be enhanced by including elements such as a cysteine residue or a randomly encoded
peptide of from 3-12 amino acids at or near the break-point termini of the fragment pair
member, or by introducing 1-3 codon changes within the nucleotide sequence encoding for
a member of a fragment pair. The invention also provides for efficient methods of finding
functional fragment pairs of a marker protein that involve identifying functional break-points
within flexible loops using tertiary or secondary structural information. The
interaction-dependent protein activation systems of the present invention find particular use
in identifying immunoglobulin epitopes, polypeptide sequences that bind to extracellular
proteins, and inhibitors of phosphorylation-regulated signal transducer proteins. The
systems also find use in allowing single antibiotic selection of cells transformed to express
genes for multiple traits and for targeted and localized activation of derivatized anti-tumor
prodrugs.



BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1. Mechanism for Interaction-dependent Enzyme Activation (IdEA). Interaction-dependent
fragment complementation requires enzyme α and ω fragments which can refold
to form active enzyme when and only when they are brought together by an interaction of
heterologous domains fused to their termini.

Figure 2. Nucleotide coding sequence for the mature form of TEM-1 β-lactamase and the
encoded amino acid sequence (Sutcliffe, Proc Natl Acad Sci (1978) 75:3737). From the
sequence for plasmid pBR322 (SYNPBR322), Genbank accession no. J01749. The break-points
between the α and ω fragments at residues Asn52/Ser53, Glu63/Glu64,
Gln99/Asn100, Pro174/Asn175, Glu197/Leu198, Lys215/Val216, Ala227/Gly228 and
Gly253/Lys254 are indicated.

Figure 3. Three-dimensional structure of mature TEM-1 β-lactamase. Rendering of the
x-ray crystal structure of Jelsch et al. (Proteins Struct Funct (1993) 16:364ff), using red
and blue solid ribbons to show α-helix and β-sheet, respectively. The molecule is oriented
to emphasize the two-domain structure (α/β and α). The active site nucleophile, Ser70, is
shown as a ball-and-stick model.

Figure 4. Three-dimensional representation of interaction-dependent activation of
β-lactamase by fragment complementation. Docking of TEM-1 α197 and ω198 fragments
by the interaction of the hetero-dimerizing helices from the fos and jun subunits of the
AP-1 transcription activator allows re-folding of the fragments into the active conformation
of the enzyme (compare with Figure 3).

Figure 5. Structures of some anti-cancer drugs and their cephalosporin prodrugs.
YW-200 and YW-285 are a DNA-binding tri-indole and its cephalosporin prodrug (Wang
1998, US Patent 5,843,937).

Figure 6. Vectors and strategy for the expression of heterologous proteins as fusions to
the α197 and ω198 fragments of TEM-1 β-lactamase for interaction-dependent β-lactamase
activation by fragment complementation. Vector pAO1 is a high-copy pUC119-based
phagemid for expression of ω198 fusions and free ligands from dicistronic transcripts,
which can be rescued as phage for quantitative introduction into host cells by
high-multiplicity infection. Vector pAE1 is a low-copy p15A replicon with a strong promoter
for expression of α197 fusions at comparable or higher levels than expression from the
pAO1 vector. Trxpeps are 12-mer peptides inserted into the active site of thioredoxin.
Tripep-trx libraries are random tri-peptides at the N-terminus of thioredoxin with an
intervening Gly4Ser linker. scFv, single-chain antibody Fv fragment. LC-CH1, antibody
fragment composed of light chain and first constant region of heavy chain. VL, antibody
light chain variable region. lac prom, lactose operon promoter. SP, signal peptide.
(Gly4Ser)3, flexible 15-mer linker. pUC ori, p15A ori, plasmid origins of replication. f1
ori, filamentous phage origin of replication. cat, chloramphenicol resistance gene. m.o.i.,
multiplicity of infection. trc prom, fusion promoter from tryptophan and lactose operons.
tt, transcription terminator. kan, kanamycin resistance gene. Vector sizes are given in base pairs.

Figure 7. TEM-1 β-lactamase fragment complementation by interaction between
representative single-chain antibody Fv fragment (scFv) and thioredoxin-scaffolded peptide
(Trx). The N-terminal β-lactamase fragment, α197 (α), is colored red. The C-terminal
fragment, ω198 (ω), is colored blue. TEM-1, thioredoxin, and the scFv were rendered
from published structures. The peptide and the linkers were drawn in.



Figure 8. TEM-1 β-lactamase fragment complementation by interaction between the CD40
extra-cellular domain (CD40) and a thioredoxin-scaffolded peptide (Trx). The N-terminal
β-lactamase fragment, α197 (α), is colored red. The C-terminal fragment, ω198 (ω), is
colored blue. TEM-1, thioredoxin, and the scFv were rendered from published structures.
The peptide and the linkers were drawn in.

Figure 9. Vectors and protocol for construction of a multiplex protein-protein interaction
library using interaction-dependent β-lactamase fragment complementation systems.
Expressed sequence (ES) libraries, i.e., random-primed cDNA libraries, are subcloned into
phagemid vectors for expression as fusions to the β-lactamase α and ω fragments, via the
flexible linker (Gly4Ser)3. The vectors encode a peptide epitope tag, such as the 12-residue
Myc tag, at the C-terminus of the ES. When co-expressed with anti-Tag scFv, such as
anti-myc 9E10, fused to the other fragment, the ES libraries can be selected for β-lactamase
activity driven by the Tag-anti-Tag interaction, which will require stable expression of the
ES fragment. The resultant libraries, enriched for stable expressors of autonomously
folding domains (AFD), may then be rescued as phage and co-infected into male cells for
selection of interacting AFD pairs (Multiplex Interaction Library). The AFD libraries can
also be co-infected with scFv libraries, antibody light chain variable region libraries (VL),
or peptide libraries displayed on thioredoxin (trx-peptide) for simultaneous selection of
binding proteins for each AFD (Multiplex Antibody/Peptide Binder Selection). See legends
to Figures 6 and 10 for identification of other abbreviations.



Figure 10. Abbreviated output of the PredictProtein Program for prediction of secondary
structure and solvent exposure for NPTII (Rost and Sander, 1993, 1994). The top line
shows the amino acid sequence in single letter code. The second and third lines show
secondary structure prediction. H, helix; E, strand; L, loop. The fourth line shows a
measure of reliability on a scale from 1 to 10, with 10 being highest. The fifth line shows
solvent accessibility - e, exposed; b, buried. The bottom line shows a measure of reliability
for solvent accessibility on a scale of 1 to 10, with 10 being highest. Ten regions of the
sequence predicted to have little secondary structure and to be exposed to solvent are
indicated by underlining as potential sites for productive fragmentation.



Figure 11. Expression vectors for production of β-lacα253 and β-lacω254 fusion proteins
with scFv. Arrows denote translation start sites. T7 prom, bacteriophage T7 promoter;
SP, pelB signal peptide; scFv is comprised of VH (antibody heavy chain variable region),
(Gly4Ser)3 (15-mer flexible linker), and VL (antibody light chain variable region); kan,
kanamycin resistance; His6, hexa-histidine tag for metal ion affinity purification; lacIq,
high-affinity lac operon repressor mutant; f1 ori, phage origin of replication.

BRIEF DESCRIPTION OF THE SPECIFIC EMBODIMENTS

Methods and compositions are provided for an interaction-dependent protein
activation system useful in detecting an interaction between a first protein and a second 
target protein. The method detects the interaction of a first known or unknown interactor 
domain with a second unknown interactor domain by bringing into close proximity 
members of a fragment pair of a marker protein, such that the parent marker protein is 
reassembled to its original functionality, and such that reassembly requires the prior 
interaction of the heterologous interactor domains. The system is characterized by N- 
terminal and C-terminal fragment members that comprise fragment pairs which are derived 
from, and can functionally reassemble into a marker protein that provides for a directly 
detectable signal that does not involve downstream steps necessary for recognition. For 
example, a marker protein of interest for the instant invention functions of itself to produce 
a selectable signal such as a visible phenotypic change or antibiotic resistance. 

The fragment pairs are used in methods that involve the co-expression of a first and
a second oligopeptide sequence, in which the first oligopeptide sequence is a fusion protein
comprised of, in the direction of translation, an N-terminal fragment fused through a
break-point terminus to a flexible polypeptide linker and a first interactor domain, and the second
oligopeptide sequence is a fusion protein comprised of, in the direction of translation, a
second interactor domain and a flexible polypeptide linker fused through a break-point
terminus to a C-terminal fragment. The flexible polypeptide linker separates the fragment
domain from the interactor domain and allows for their independent folding. The linker is
optimally 15 amino acids or 60 Å in length (~4 Å per residue) and may be as long as 30
amino acids, but preferably is not more than 20 amino acids in length. It may be as short as 3
amino acids in length, but more preferably is at least 6 amino acids in length. To ensure
flexibility and to avoid introducing steric hindrance that may interfere with the independent
folding of the fragment domain and the interactor domain, the linker should be comprised
of small, preferably neutral residues such as Gly, Ala and Val, but also may include polar
residues that have heteroatoms such as Ser and Met, and may also contain charged
residues.
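
The linker length ranges above can be expressed as a small check. This is a minimal sketch, not part of the disclosed method: the function names are hypothetical, the (Gly4Ser)3 sequence is written in single-letter code, and the span estimate simply applies the approximate 4 Å per residue figure quoted in the text.

```python
ANGSTROM_PER_RESIDUE = 4.0  # approximate per-residue span quoted in the text


def make_linker(repeats: int = 3, unit: str = "GGGGS") -> str:
    """Build a (Gly4Ser)n-style linker; three repeats give the 15-mer."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def linker_span_angstrom(linker: str) -> float:
    """Estimated span of the linker using the ~4 Å per residue approximation."""
    return len(linker) * ANGSTROM_PER_RESIDUE


def classify_linker(linker: str) -> str:
    """Classify a linker length against the ranges stated in the text."""
    n = len(linker)
    if n < 3 or n > 30:
        return "out of range"  # shorter than 3 or longer than 30 residues
    if n == 15:
        return "optimal"       # 15 residues, about 60 Å
    if 6 <= n <= 20:
        return "preferred"     # at least 6 and not more than 20 residues
    return "allowed"           # 3-5 or 21-30 residues


linker = make_linker()                       # "GGGGSGGGGSGGGGS"
span = linker_span_angstrom(linker)          # 60.0
category = classify_linker(linker)           # "optimal"
```

The classification boundaries mirror the stated limits (3 to 30 residues permitted, 6 to 20 preferred, 15 optimal); residue composition is not checked in this sketch.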

The first interactor domain is a known or unknown protein or protein fragment that
binds directly or indirectly to a second target interactor domain that is an unknown protein
or protein fragment, and either or both the first and second interactor domain can be a
member of a library. The interactor domain libraries are preferably constructed from
cDNA, but may also be constructed from, for example, synthetic DNA, RNA and genomic
DNA. When combining the first and second oligopeptide sequences, the reconstitution of
the N-terminal and C-terminal fragments into the marker protein requires the prior
interaction of the first and second interactor domains. Bound interactor domains are
identified by expressing a functionally reconstituted marker protein, and then the nucleotide
sequences encoding for bound interactor domains or the bound interactor domains
themselves are characterized by methods including electrophoresis, polymerase chain
reaction (PCR), nucleotide and amino acid sequencing and the like.

Advantages of the present invention over previously disclosed fragment 
complementation systems include a reporter protein that provides for a directly detectable 
signal upon reassembly, and background levels of 1 in 10^ or less. Additionally, the 
invention provides for rationally incorporated enhancement modifications to the fusion 
oligopeptides that increase the functional activity of the reconstituted protein to wild-type
levels by improving folding and reassembly of the fragments into the parent protein, while 
at the same time maintaining dependence on the interactor domains for reassembly. 

The interaction-dependent enzyme activation system of the subject invention may be
used to detect in vitro protein interactions, such as in cell lysates, or the interactions of
intracellular or extracellular proteins of a host cell. For evaluating interactions between
extracellular proteins, the first and second fusion oligopeptides can be expressed with a
signal peptide. In bacterial host cells, for example, an N-terminal signal peptide can
provide for translocation of the fusion oligopeptides to the periplasm. The N-terminal
fragment and the C-terminal fragment may be discontinuous, with residues around the
break-point deleted; contiguous; or overlapping, with residues around the break-point
repeated, so that their combined lengths comprise from 90% to 110% of the total length of the
parent protein. Break-point termini are herein defined as the C-terminus of the N-terminal
fragment and the N-terminus of the C-terminal fragment.
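
The 90% to 110% coverage criterion can be sketched as a simple calculation. The example below is illustrative only: it assumes consecutive residue numbering from 1 to the parent length, and the 200-residue parent and the function names are hypothetical rather than taken from the disclosure.

```python
def fragment_coverage(parent_length: int, n_frag_end: int, c_frag_start: int):
    """Describe a fragment pair cut from a parent protein.

    The N-terminal fragment is residues 1..n_frag_end and the C-terminal
    fragment is residues c_frag_start..parent_length (1-based, inclusive).
    Returns (arrangement, coverage), where coverage is the combined fragment
    length divided by the parent length.
    """
    if not (1 <= n_frag_end <= parent_length and 1 <= c_frag_start <= parent_length):
        raise ValueError("break-point termini must lie within the parent protein")
    n_len = n_frag_end
    c_len = parent_length - c_frag_start + 1
    coverage = (n_len + c_len) / parent_length
    if c_frag_start == n_frag_end + 1:
        arrangement = "contiguous"
    elif c_frag_start > n_frag_end + 1:
        arrangement = "discontinuous"  # residues around the break-point deleted
    else:
        arrangement = "overlapping"    # residues around the break-point repeated
    return arrangement, coverage


def within_stated_range(coverage: float) -> bool:
    """True when the combined length is 90% to 110% of the parent length."""
    return 0.90 <= coverage <= 1.10


# Hypothetical 200-residue parent with ten residues deleted at the break-point.
arrangement, coverage = fragment_coverage(200, 100, 111)
```

For the hypothetical pair shown, the arrangement is discontinuous and the combined length is 95% of the parent, which falls inside the stated range.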

The subject invention provides for enhancing the performance of the reassembled
parent protein by introducing at least one of the following modifications, including: i) a
randomly-encoded peptide of 3-12 amino acids between the break-point terminus of each
fragment and the flexible polypeptide linker; ii) a randomly-encoded peptide of 3-12 amino
acids expressed separately as a fusion to the N-terminus of a thioredoxin with an
intervening flexible linker; iii) a cysteine residue encoded at or within 5 amino acid
positions of the break-point and between the break-point terminus of each fragment and the
flexible polypeptide linker so that a disulfide bond can form between the members of a
fragment pair; and iv) 1-3 codon changes within a member of a fragment pair introduced,
for example, by PCR amplification of a nucleotide sequence encoding for a member of a
fragment pair under error-prone conditions, to enhance the folding stability of a
functionally reconstituted marker protein.

The invention is also directed to plasmids containing expression cassettes 
constructed to express fusion oligopeptides comprised of a fragment domain and an 
interactor domain. The expression cassettes for the N-terminal and C-terminal fragment 
pair members are designed with their components in different sequential orders. For the 
C-terminal fragment pair member, the expression cassette will comprise as operably linked 
components in the direction of transcription nucleotide sequences encoding for (i) a 
promoter functional in a host cell, (ii) a polypeptide interactor domain, (iii) a flexible 
polypeptide linker and (iv) a C-terminal fragment of a marker protein that provides for a 
directly selectable phenotype. The expression cassette for the N-terminal fragment pair 
member will comprise as operably linked components in the direction of transcription
nucleotide sequences encoding for (i) a promoter functional in a host cell, (ii) an
N-terminal fragment of a marker protein that provides for a directly selectable phenotype,
(iii) a flexible polypeptide linker and (iv) a polypeptide interactor domain. The invention is
also concerned with host cells that contain plasmids having the sequences of the
above-described expression cassettes.
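
The two cassette layouts described above differ only in the order of their components. The sketch below records that order and assembles a toy fusion polypeptide in the direction of translation; the tuple names, the function name and the placeholder sequences are hypothetical and serve only to illustrate the ordering.

```python
# Component order in the direction of transcription, as described in the text.
N_TERMINAL_CASSETTE = (
    "promoter", "N-terminal fragment", "flexible linker", "interactor domain",
)
C_TERMINAL_CASSETTE = (
    "promoter", "interactor domain", "flexible linker", "C-terminal fragment",
)


def assemble_fusion(member: str, fragment: str, linker: str, interactor: str) -> str:
    """Return the fusion polypeptide in the direction of translation.

    member is "N" for the N-terminal fragment pair member or "C" for the
    C-terminal member. The promoter is not translated, so it is omitted.
    """
    if member == "N":
        return fragment + linker + interactor
    if member == "C":
        return interactor + linker + fragment
    raise ValueError('member must be "N" or "C"')


# Placeholder sequences only; these are not real protein sequences.
n_fusion = assemble_fusion("N", "AAA", "GGGGSGGGGSGGGGS", "KKK")
c_fusion = assemble_fusion("C", "WWW", "GGGGSGGGGSGGGGS", "EEE")
```

In both fusions the linker sits at the break-point terminus of the fragment, so the interactor domain is attached where the parent protein was cut.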

Appropriate host cells for application of the subject invention include both
eukaryotic cells, such as mammalian, yeast and plant cells, and prokaryotic cells, such as
bacterial cells. A variety of prokaryotic expression systems can be used to express the
fusion oligopeptides of the subject invention. Expression vectors can be constructed which
contain a promoter to direct transcription, a ribosome binding site, and a transcriptional
terminator. Examples of regulatory regions suitable for this purpose in E. coli are the
promoter and operator region of the E. coli tryptophan biosynthetic pathway as described
by Yanofsky (1984) J. Bacteriol., 158:1018-1024 and the leftward promoter of phage
lambda (PL) as described by Herskowitz and Hagen (1980) Ann. Rev. Genet., 14:399-445.
Vectors used for expressing foreign genes in bacterial hosts generally will contain a
sequence for a promoter which functions in the host cell. Plasmids useful for transforming
bacteria include pBR322 (Bolivar et al., (1977) Gene 2:95-113), the pUC plasmids
(Messing, (1983) Meth. Enzymol. 101:20-77; Vieira and Messing, (1982) Gene 19:259-268),
pCQV2 (Queen, ibid.), and derivatives thereof. Plasmids may contain both viral and
bacterial elements. Methods for the recovery of the proteins in biologically active form are
discussed in U.S. Patent Nos. 4,966,963 and 4,999,422, which are incorporated herein by
reference. See Sambrook et al. (In Molecular Cloning: A Laboratory Manual, 2nd Ed.,
1989, Cold Spring Harbor Laboratory Press, Cold Spring Harbor) for a description of
other prokaryotic expression systems.

For expression in eukaryotes, host cells for use in practicing the present invention
include mammalian, avian, plant, insect, and fungal cells. As an example, for plants, the
choice of a promoter will depend in part upon whether constitutive or inducible expression
is desired and whether it is desirable to produce the fusion oligopeptides at a particular
stage of plant development and/or in a particular tissue. Expression can be targeted to a
particular location within a host plant such as seed, leaves, fruits, flowers, and roots, by
using specific regulatory sequences, such as those described in USPN 5,463,174, USPN
4,943,674, USPN 5,106,739, USPN 5,175,095, USPN 5,420,034, USPN 5,188,958, and
USPN 5,589,379.

Where the host cell is a yeast cell, transcription and translational regions functional 
in yeast cells are provided, particularly from the host species. The transcriptional initiation 
regulatory regions can be obtained, for example from genes in the glycolytic pathway, such 
as alcohol dehydrogenase, glyceraldehyde-3-phosphate dehydrogenase (GPD), 
phosphoglucoisomerase, phosphoglycerate kinase, etc. or regulatable genes such as acid 
phosphatase, lactase, metallothionein, glucoamylase, etc. Any one of a number of 
regulatory sequences can be used in a particular situation, depending upon whether 
constitutive or induced transcription is desired, the particular efficiency of the promoter in 
conjunction with the open-reading frame of interest, the ability to join a strong promoter 
with a control region from a different promoter which allows for inducible transcription, 
ease of construction, and the like. Of particular interest are promoters which are activated 
in the presence of galactose. Galactose-inducible promoters (GAL1, GAL7, and GAL10)
have been extensively utilized for high level and regulated expression of protein in yeast
(Lue et al., (1987) Mol Cell Biol 7:3446; Johnston, (1987) Microbiol Rev 51:458).

The invention also provides efficient methods of identifying functional fragment
pairs of a marker protein of interest, which involve preparing a multiplicity of fragment pair
members with break-point termini within a solvent-exposed loop or a flexible loop defined
by tertiary or secondary structure analysis to obtain a fragment pair library. The fragment
pair members are expressed in a multiplicity of host cells, and the host cells exhibiting the
directly detectable signal associated with the marker protein of interest are isolated as
indicative of containing fragment pair members that functionally reconstitute the marker
protein. Plasmids containing expression cassettes coding for the fragment pair members
are then sequenced to identify functional fragment pairs. To aid in the identification of
functional fragment pair members of a marker protein of interest, the fragment pair
members can be expressed as fusion proteins with interactor domains known to bind to
each other, such as the fos and jun transcription factors that associate through a leucine
zipper interaction. The sequences encoding the hetero-dimerizing helices of the fos and jun
transcription factors are sufficient to use as effective interactor domains for this purpose.

The systems and methods of the subject invention find particular use in identifying 
epitopes recognized by immunoglobulin molecules, polypeptide sequences that bind to 
extracellular domains of a transmembrane protein, inhibitors of phosphorylation-regulated
signal transducer proteins, and interaction between oligopeptides of two different
proteomes. For the identification of epitopes, first and second fusion oligopeptides 
comprised of a fragment domain and an interactor domain are expressed in a host cell 
where the first fusion oligopeptide has an interactor domain comprised of a randomly 
encoded peptide inserted into the active site of a thioredoxin protein and the interactor 
domain of the second fusion oligopeptide is comprised of a single-chain variable region 
(scFv) or antibody light chain variable region (VL). A similar strategy is followed for 
identifying polypeptide sequences that interact with the extracellular domain of a 
transmembrane protein, where the first interactor domain is comprised of a randomly 
encoded peptide inserted into the active site of a thioredoxin protein and the second 
interactor domain is comprised of a transmembrane protein. Identification of inhibitors of 
a phosphorylation-regulated signal transduction protein involves expressing a first fusion 
oligopeptide with a first interactor domain comprised of a phosphorylation-regulated signal
transduction protein, such as Her-2/neu, and a second fusion oligopeptide with a second
interactor domain comprised of a scFv or antibody light chain variable region that only
binds to the unphosphorylated signal transduction protein. Inhibitory compounds are
identified from host cells that change color in the presence of a chromogenic β-lactamase
substrate. For identifying or monitoring polypeptide-polypeptide interactions between the 
members of two different proteomes, members of a first and second cellular expression 
library comprise the first and second interactor domains, respectively, of a fusion
oligopeptide. The expression library is preferably a cDNA library, but may also be
constructed from synthetic nucleotides to screen randomly generated polypeptides. A
library of particular application for the present invention should represent all the protein
members of a proteome of interest. Libraries derived from nucleotide sequences that encode
all members of a total protein population (i.e., a proteome) of interest may be isolated from a
host cell such as a prokaryotic or a eukaryotic cell, or also from a viral host. Viral hosts
that encode for oncogenes are of particular interest. Mammalian tumor cells, immune cells 
and endothelial cells also provide proteomes of particular interest for the subject invention. 

The invention also finds use in selecting with a single marker protein the 
incorporation of multiple genetic traits in a host cell, where detectable expression of a 
functionally reassembled marker protein is indicative of co-expression of multiple genes 
that encode for individual traits in a host. Finally, the invention provides therapeutic utility
in a method for specifically activating derivatized prodrugs in the vicinity of a target organ
in a host, where each member of a marker protein fragment pair is expressed as a fusion 
protein with individual immunoglobulin molecules that recognize neighboring but non- 
overlapping epitopes on a target protein. Binding of both antibodies to the target protein 
allows functional reconstitution of the marker protein which then activates subsequently 
administered prodrug only in the vicinity of a target organ. 

The invention is exemplified by the antibiotic resistance enzyme, TEM-1 β-lactamase,
although fragment pairs of other enzymes that provide for antibiotic resistance
are included in the present invention, including aminoglycoside phosphotransferases,
particularly neomycin phosphotransferase, chloramphenicol acetyl transferase, and the
tetracycline resistance protein described by Backman and Boyer (Gene (1983) 26:197).
Other proteins that can directly elicit a visible phenotypic change such as a color change or
fluorescence emission also are applicable to the subject invention. Examples of such
proteins include β-galactosidase and green fluorescent protein (GFP) or other related
fluorescent proteins.

The TEM-1 β-lactamase of E. coli is the 264 amino acid product of the ampicillin
resistance gene of plasmid pBR322 (Sutcliffe, 1978, supra), the nucleotide sequence of
which is shown in Figure 2 along with the encoded amino acid sequence. TEM-1 is the
archetype member of the homologous Class A β-lactamases, or penicillinases. Its three-
dimensional structure is shown in Figure 3 (Jelsch et al., Proteins Struct Funct (1993)
16:364ff). The Class A β-lactamases are comprised of two domains. One domain, α-ω, is
made up of N-terminal and C-terminal sequences, which form an anti-parallel two-helix
bundle packed against a flat 5-stranded β-sheet. The inner face of the sheet packs against
the other domain, a seven-helix bundle with two extended loops and two small β-
structures. An outside strand of the β-sheet borders the substrate binding pocket, opposite
the catalytic nucleophile, Ser70, and contributes substrate-binding residues. The remainder
of the active site residues, including Ser70, are contributed by the second domain. The two
domains are connected by two loops: R61-R65 and D214-W229.

The subject invention also provides a method of identifying optimal break-points in
a parent protein that provides for a directly detectable signal. A search of the "fragment
space" of TEM-1 β-lactamase was conducted to identify fragment pairs which could
complement for activity only when the break-point termini of the fragments were
genetically fused to hetero-dimerizing helixes from the c-fos and c-jun subunits of the AP-1
transcription factor (Karin et al., Curr Opin Cell Biol (1997) 9:240). To do this, libraries
of all possible N- and C-terminal fragments of the enzyme were generated by progressive
exonucleolytic digestion of the full coding sequence from both termini. Fragments of less
than 25 amino acids were considered non-viable. When libraries were constructed with
compatible vectors, the fragment sequences were co-expressed in the same E. coli cells so that
each cell expressed a single pair of N- and C-terminal fragments and every possible pair
may be represented. For example, for a 100 kDa enzyme there are only 10^ possible N- 
and C-terminal fragment pairs, so an exhaustive search of the fragment space of most 
enzymes could be conducted with libraries of a manageable size. An exposed loop was 
identified by this method between two α-helixes of E. coli TEM-1 β-lactamase
(approximately Thr195 to Ala202, between helixes 7 and 8) within which the chain could
be broken to produce fragments which could only complement for activity when fused to
the fos and jun helixes. Representative fragments with contiguous break-point termini at
Glu197 and Leu198 were designated α197 (N-terminal fragment) and ω198 (C-terminal
fragment), and subsequently shown to produce selectable activity in the E. coli periplasm
with interactions between a variety of heterologous domains fused to the break-point
termini, including single-chain antibody Fv fragments (scFv), antibody light chains (LC),
thioredoxin with 12-mer peptides inserted into the active site (trxpeps), and the extra-
cellular domain of the B-cell activation antigen CD40 (CD40ED). Activation by
complementation of α197 and ω198 could also be driven by interaction of the heterologous
domains with a third polypeptide, such as a receptor. Contiguous break-point termini of
interest in E. coli TEM-1 β-lactamase in addition to E197/L198 include amide-bond
junctions between amino acid residues N52/S53, E63/E64, Q99/N100, P174/N175,
K215/V216, A227/G228, and G253/K254. The fragments of a pair need not be exactly
contiguous, however: they may be discontinuous or overlapping, with combined lengths
comprising from 90% to 110% of the total length of the parent protein, and the actual
break-point could be within ten amino acid residues in either direction from an identified
functional contiguous break-point junction. The specific activity of the reconstituted
enzyme can be enhanced to near wild-type levels by the interaction-driven formation of a
disulfide at the break-point, which restores the integrity of the native polypeptide backbone
(see Figure 4).
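
The size of the fragment space discussed above can be estimated with a short calculation. The following is a minimal sketch, not part of the described method: it assumes that every N-terminal and every C-terminal truncation of at least the minimum viable length (25 residues, as stated above) is allowed, that the full-length chain is excluded, and that mass can be converted to length with an assumed average of 110 Da per residue. All function names are hypothetical.

```python
"""Rough size of the N-/C-terminal fragment-pair space of a protein."""

AVG_RESIDUE_MASS_DA = 110.0  # assumption: typical average residue mass
MIN_FRAGMENT_LEN = 25        # from the text: shorter fragments considered non-viable


def fragments_per_terminus(parent_len: int, min_len: int = MIN_FRAGMENT_LEN) -> int:
    """Number of truncated fragments anchored at one terminus.

    Counts lengths min_len .. parent_len - 1 (the full-length chain is excluded).
    """
    return max(0, parent_len - min_len)


def fragment_pair_count(parent_len: int, min_len: int = MIN_FRAGMENT_LEN) -> int:
    """Every N-terminal fragment paired with every C-terminal fragment."""
    n = fragments_per_terminus(parent_len, min_len)
    return n * n


def combined_length_ok(n_len: int, c_len: int, parent_len: int,
                       low: float = 0.90, high: float = 1.10) -> bool:
    """True if the pair covers 90% to 110% of the parent length (gap or overlap allowed)."""
    return low * parent_len <= n_len + c_len <= high * parent_len


if __name__ == "__main__":
    residues_100kda = round(100_000 / AVG_RESIDUE_MASS_DA)
    print("approx. residues in a 100 kDa enzyme:", residues_100kda)
    print("fragment pairs, 100 kDa enzyme:", fragment_pair_count(residues_100kda))
    print("fragment pairs, 264-residue protein:", fragment_pair_count(264))
```

Under these assumptions a 100 kDa enzyme corresponds to roughly 900 residues and somewhat fewer than a million fragment pairs, which is the scale at which an exhaustive library search remains practical.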

The β-lactamase α197 and ω198 fragments, which cooperatively produce selectable activity
in the bacterial periplasm in a manner that is strictly dependent on specific interaction
between heterologous domains fused to the break-point termini of the fragments, are an
example of an enzyme-based molecular interaction sensor that can undergo secretory
translocation across a plasma membrane into an extra-cellular compartment, and therefore
can reliably detect interactions between and among extra-cellular proteins.

The interaction-dependent enzyme association systems of the present invention find
use in many applications in human therapeutics, diagnostics, and prognostics, as well as in
high-throughput screening systems for the discovery and validation of pharmaceutical
targets and drugs.

One particular application is concerned with the localized and controlled activation
of inactive or weakly active compounds. For example, many useful compounds, such as
drugs, chromophores, and fluorophores, can be inactivated by conjugation of an essential
moiety on the compound, such as a hydroxyl or amino group, to a substrate for enzymatic
hydrolysis, such as an ester, amide, carbamate, phosphate, glycoside, or glucuronide
(Jungheim and Shepherd, Chem Rev (1994) 94:1553). Such conjugates can then be
activated by the appropriate hydrolytic enzymes such as esterases, carboxypeptidases,
alkaline phosphatases, glycosidases, glucuronidases, β-lactamases, and penicillin-amidases.
In one particularly versatile system, cephalosporins may be conjugated at the 3' position
via a variety of different leaving groups to a variety of anti-cancer drugs, such as nitrogen
mustards, methotrexate, anthracyclines, and vinca alkaloids (Svensson et al., J Med Chem
(1998) 41:1501; Vrudhula et al., J Med Chem (1995) 38:1380; Jungheim and Shepherd,
1994, supra; Alexander et al., Tetrahedron Lett (1991) 32:3269; see also Figure 5). All of
these are good substrates for broad spectrum β-lactamases, and most are much less active
than their parent drugs. As a result, these prodrugs are promising candidates for use in
Antibody-Directed Enzyme Prodrug Therapy (ADEPT; Bagshawe, Drug Devel Res (1995)
34:220). In addition to these compounds a vast array of antibiotics (Holbrook and Lowy,
Cancer Invest (1998) 16:405), as well as a variety of chromogenic and fluorogenic
substrates have been developed for β-lactamases (Jones et al., J Clin Microbiol (1982)
15:677; Jones et al., J Clin Microbiol (1982) 15:954; Zlokarnik et al., Science (1998)
279:84), making them one of the most versatile known classes of enzymes.

Nevertheless, the utility of such enzymes would be greatly enhanced if they could
be engineered so that their catalytic activities could be positively controlled by allosteric
interaction with ligands of choice. In this way the catalytic power of these enzymes could
be harnessed to multiple new applications, including (1) rapid, ultra-sensitive detection of
trace analytes and pathogens in biological specimens or in food, (2) targeted activation of
therapeutic and diagnostic reagents at specific locations in the body, (3) rapid enrichment
of expressed sequence libraries for autonomously folding domains (AFDs), (4) massively
parallel mapping of pair-wise protein-protein interactions within and between the proteomes
of cells, tissues, and pathogenic organisms, (5) rapid selection of antibody fragments or
other binding proteins to whole proteomes, (6) rapid antigen identification for anti-cell and
anti-tissue antibodies, (7) rapid epitope identification for antibodies, and (8) high-throughput
screens for inhibitors of any protein-protein interaction.

For example, enzymes which could be activated to hydrolyze chromogenic
substrates only upon binding to target analytes could form the basis of assays for those
analytes of unparalleled sensitivity and convenience. Such assays would be homogeneous,
requiring no manipulations other than the mixing of two components, namely the enzyme
and substrate, with a biological specimen, in which the presence of the analyte is then
quantitatively indicated by the rapid development of color. Current homogeneous
enzymatic assays rely on inhibition of the enzyme by binding of anti-analyte antibody to the
analyte, or mimic thereof, immobilized on the surface of the enzyme (Coty et al.,
J Clin Immunoassay (1994) 17:144; Legendre et al., Nature Biotech (1999) 17:67). Free
analyte is estimated by its ability to competitively displace the antibody, thereby activating
the enzyme. Such enzymes are thus activated competitively, not allosterically. For assays
employing such enzymes the maximum signal increment occurs at equilibrium with roughly
equimolar concentrations of reagents, so that typically only a fraction of analyte molecules
participates in signal generation, and equilibration is often slow or does not even reach
completion. However, an enzyme which is activated by direct allosteric interaction with
analyte can be used in excess, so that equilibration is rapid and independent of the analyte
concentration, and the analyte can be saturated to produce signal from every molecule. In
the case of microbial or viral pathogens, where unique surface markers may be present in 
hundreds to thousands of copies per cell or particle, such enzymes, which would be 
activated by binding to the marker, could allow rapid detection of as little as a single cell 
or particle, whereas the sensitivity of equilibrium assays for such analytes would typically 
be much lower. 
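
The advantage of using the enzyme in excess can be illustrated with a simple equilibrium calculation. The sketch below is an illustration only and assumes an idealized 1:1 binding model (analyte + enzyme in equilibrium with a complex, governed by one dissociation constant); the concentrations and the 1 nM affinity are hypothetical values chosen for the example, and the function name is not from the source.

```python
"""Fraction of analyte bound at equilibrium for simple 1:1 binding (illustrative)."""
import math


def bound_fraction(analyte: float, enzyme: float, kd: float) -> float:
    """Fraction of analyte in complex for A + E <-> AE with dissociation constant kd.

    All arguments are total concentrations in the same units (e.g. mol/L).
    Solves the exact quadratic, written in a numerically stable form.
    """
    if analyte <= 0:
        raise ValueError("analyte concentration must be positive")
    s = analyte + enzyme + kd
    disc = s * s - 4.0 * analyte * enzyme  # always >= 0 for non-negative inputs
    complex_conc = 2.0 * analyte * enzyme / (s + math.sqrt(disc))
    return complex_conc / analyte


if __name__ == "__main__":
    kd = 1e-9  # hypothetical 1 nM affinity
    # Roughly equimolar reagents: only part of the analyte contributes to signal.
    print("equimolar:", bound_fraction(analyte=1e-9, enzyme=1e-9, kd=kd))
    # Enzyme in large excess over trace analyte: nearly every molecule is bound.
    print("enzyme in excess:", bound_fraction(analyte=1e-12, enzyme=1e-7, kd=kd))
```

With equimolar reagents at concentrations near the dissociation constant, well under half of the analyte is bound, whereas a hundred-fold excess of enzyme over the dissociation constant binds nearly all of it.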

In another class of applications interaction-activated enzymes could be adapted for
activation by binding to specific cell surface molecules. This would allow the enzyme to
become localized and activated at specific sites in the body for target-restricted activation
of reagents for therapy or imaging. Antibody-Directed Enzyme Prodrug Therapy
(ADEPT; Bagshawe, 1995, supra) is a promising chemotherapeutic strategy for the
treatment of cancer, in which a prodrug-activating enzyme, such as a β-lactamase, is
targeted to the tumor by a tumor-specific antibody to which it is chemically or genetically
conjugated. After unbound conjugate has cleared the circulation, an inactive prodrug, such
as an anthracycline cephalosporin, is administered, which is converted to a potent tumor-
killing cytotoxin at the site of the tumor by the remaining tumor-bound enzyme. The main
problem with ADEPT is that the unbound conjugate must clear the circulation before the
prodrug can be administered in order to minimize systemic toxicity. However, by the time
the conjugate has cleared the circulation >90% of the tumor-bound enzyme has been lost
(Bagshawe, 1995, supra; Springer and Niculescu-Duvaz, Anti-Cancer Drug Design (1995)
10:361). In spite of this, ADEPT has been able to achieve higher active drug
concentrations in the tumor than any other procedure (Sedlacek et al., 1992, In
Contributions to Oncology, Huber H and Queisser V, eds., pp. 208ff, Karger, Basel), and
has shown promise in the clinic (Bagshawe et al., Dis Markers (1991) 9:233; Springer and
Niculescu-Duvaz, 1995, supra; Martin et al., Cancer Chemother Pharmacol (1997)
40:189). The unbound conjugate problem could be completely obviated by a prodrug-
activating enzyme which would be active only when bound to the tumor, so that the
prodrug could be administered simultaneously with the enzyme or at the point of peak
tumor loading without regard for unbound enzyme which would be inactive.

In the same way, interaction-activated enzymes could be targeted for activation by 
surface markers on the cells of other types of diseased tissues, such as sites of inflammation 
or atherogenesis, or even healthy tissues. The target-localized and activated enzymes could 
then be used to activate not just cytotoxins, but other types of therapeutic agents such as 
small molecule agonists or antagonists of biological response modifiers, as well as imaging 
reagents for precise localization of tissue with disease or other phenotype of interest. For 
example, target-activatable enzymes could be used to deliver: (1) immune stimulants to 
tumors, (2) immuno-suppressants to sites of chronic inflammation or to organ transplants, 
(3) antibiotics to specific pathogens, (4) cytotoxins and anti-virals to virus-infected cells,
(5) hormones and other pleiotropic agents to specific cells and/or tissues, or (6) neuro-
transmitters and other neuro-modulators to specific nerves or tissues. In short, interaction-
activated enzymes could be used to deliver to any tissue any small molecule cytotoxin, 
hormone, steroid, prostaglandin, neurotransmitter, or agonist/antagonist of peptide 
hormone, cytokine, or chemokine, etc., which could be inactivated by conjugation to the 
appropriate substrate. 

In yet another class of applications, interaction-activated enzymes could be adapted
for efficient simultaneous detection of multitudes of interactions among proteins within
cells, including expressed sequence libraries, single-chain antibody fragment (scFv)
libraries, and scaffolded peptide libraries. For example, enzyme-based interaction traps
could enable the comprehensive mapping of pairwise protein-protein interactions within and
between the proteomes of human cells, tissues, and pathogens for the rapid identification
and validation of new pharmaceutical targets. They could also be used for rapid selection
of binding molecules from single-chain antibody fragment (scFv) libraries, or from
scaffolded peptide libraries, for use as reagents in functional genomics studies, or for
identification of natural ligands and epitopes by homology. Target interactions identified
using interaction-dependent β-lactamases could be used immediately to screen for inhibitors
of the interaction by exploiting the great substrate diversity of these enzymes to reverse the
polarity of selection. Whereas interaction-dependent activation of β-lactamase could be
used to confer selective growth on host cells in the presence of β-lactam antibiotics, it
could also be used to confer selective cytotoxicity on the cells in the presence of β-lactam
pro-antibiotics. The latter substrates would only become cytotoxic upon hydrolysis of the
β-lactam moiety by the interaction-activated enzyme, and so could be used to select
inhibitors of the interaction by their ability to confer selective growth on host cells.

Finally, enzyme-based interaction sensors could be used for rapid detection of the 
activation or inhibition of key molecular interactions in signal transduction pathways, 
enabling high-throughput cellular screens for inhibitors or activators of those pathways. 
For example, screening for agonists or antagonists of receptor tyrosine kinases usually 
requires coupling receptor ligation to a selectable phenotype which results from de novo 
gene expression. Such multi-step signal generating mechanisms are prone to high rates of 
false positive and false negative selection, like the yeast two-hybrid system, and are 
therefore poorly suited to high-throughput screening. However, interaction-dependent
β-lactamases could be set up for activation by phospho-tyrosine sensitive interactions, so
that a selectable phenotype would be generated just downstream from receptor ligation.
Interaction between the receptor tyrosine kinase substrate and a binder peptide could be
designed to be either dependent on, or inhibited by, phosphorylation, so that either receptor
agonists or receptor antagonists could be selected.

General Strategies for Making High-Performance Enzyme 
Fragment Complementation Systems 

The present invention provides general strategies for the use of heterologous
interactors, break-point disulfides, random tri-peptide libraries, and mutagenesis to obtain
stable enzyme fragments which are capable of forming catalytically robust complexes. It
has been suggested that it might be possible to identify such fragment pairs for any enzyme
simply by conducting thorough searches of all possible fragment pairs for the enzymes in
question (Ostermeier et al., Proc Natl Acad Sci (1999) 96). In practice, however, the
success of such endeavors is strongly dependent on the stringency of selection, that is, how
much functional enzyme must be produced by the expressed fragments to produce an
efficiently selectable phenotype. An efficiently selectable phenotype is one in which the
background frequency, or false positive rate, is not appreciably higher than the frequencies
of the desired fragments in the fragment libraries.
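
The relationship between background frequency and the frequency of desired fragments can be made concrete with a small calculation. This is a minimal sketch under stated assumptions: the library size, the fraction of functional clones, and the background frequency are hypothetical numbers, and the factor of ten used to interpret "not appreciably higher" is an arbitrary choice for the example.

```python
"""Expected true and false positives in a library selection (illustrative)."""


def expected_colonies(library_size: float, functional_fraction: float,
                      background_frequency: float) -> tuple[float, float]:
    """Return (expected true positives, expected false positives).

    functional_fraction: fraction of clones carrying a functional fragment pair.
    background_frequency: probability that a non-functional clone survives selection.
    """
    true_pos = library_size * functional_fraction
    false_pos = library_size * (1.0 - functional_fraction) * background_frequency
    return true_pos, false_pos


def is_efficiently_selectable(functional_fraction: float, background_frequency: float,
                              tolerance: float = 10.0) -> bool:
    """Assumed criterion: background at most `tolerance` times the desired frequency."""
    return background_frequency <= tolerance * functional_fraction


if __name__ == "__main__":
    tp, fp = expected_colonies(1e6, functional_fraction=1e-4, background_frequency=1e-5)
    print(f"true positives ~{tp:.0f}, false positives ~{fp:.0f}")
    print("selectable:", is_efficiently_selectable(1e-4, 1e-5))
```

When the background frequency greatly exceeds the frequency of functional clones, most surviving colonies are false positives and the desired fragments become impractical to recover by screening.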

In fact the most useful fragment complementation systems for a given enzyme are
not necessarily those fragments of wild-type sequence which are most capable of unassisted
complementation, but rather those fragments which, when using the engineering techniques
described, can be made to meet more specific performance requirements. For example,
naturally evolved proteins are generally expected to exhibit a roughly inverse correlation
between fragment stability and complex stability. This is due to the energy cost of
inter-conversion. The more stable the fragments are, the more energy is required to form
the complex and vice versa. As a result, those fragments capable of producing the highest
specific activities might be missed or dismissed because fragment instability may prevent
them from producing selectable levels of activity. To circumvent such pitfalls, libraries of
fragment pairs can be simultaneously expressed with libraries of random tri-peptides to
insure that every fragment pair has a chance to perform in the presence of
fragment-stabilizing tri-peptides, thereby minimizing the dependence of the phenotype on
fragment stability. This strategy is especially useful if dependence of activation on the
interaction of heterologous domains fused to the fragments is desired. If constitutive
activation is desired, the fragment libraries could also be amplified by error-prone PCR to
introduce fold-accelerating mutations which could mitigate both fragment instability and
complex instability, as was found for β-lactamase.

For in vitro applications such as homogeneous assays, biosensors, and target-
activated reagents, fragment stability is especially important, but the most stable fragments
might not be selectable if they cannot produce stable complexes without assistance, as
would be predicted by the inverse correlation of fragment stability and complex stability.
Thus, fragment libraries could be expressed in the E. coli periplasm with a disulfide at the
break-points and heterologous interactors fused to the break-point termini. These tools
provide mechanisms for docking the fragments, accelerating folding, and stabilizing the
active complex. As was shown with β-lactamase, a substantial fraction of fragment pairs
can be made to produce robust selectable activity in the bacterial periplasm with such
molecular prostheses.

Each of the four tools described for enhancement of functional reconstitution of the 
parent protein of the fragment pairs, i.e., heterologous interaction, break-point disulfide, 
tri-peptide stabilizers, and mutagenesis, can be used alone or in combination to insure 
selection of the best fragments for the desired application, and also to improve and 
optimize the performance of selected fragment pairs for a desired application. As 
demonstrated, each tool enhances performance by a different mechanism, so that the effects 
of multiple tools are generally additive. Heterologous interactors bring and hold the 
fragments together to facilitate re-folding into the active complex. Break-point disulfides 
can stabilize the active fold by restoring the integrity of the polypeptide backbone at the 
break-point. Tethered or free tri-peptides can protect the fragments from aggregation 
without interfering with folding into the active complex. Mutagenesis can protect the 
fragments by accelerating folding into the active complex. 

The first step in the development of high-performance enzyme fragment
complementation systems is to construct vectors to express each fragment in the fragment
library. A convenient system for selective fragment library expression may be derived
from the expression system illustrated in Figure 6. All fragment pairs regardless of the
intended application can potentially benefit from and would not be impaired by the docking
function provided by interactors such as the fos and jun helixes fused to the break-point
termini. Thus, the C-terminal, or ω fragment library would be expressed as N-terminal
fusions via a flexible polypeptide linker such as a (Gly4Ser)3 linker to the fos helix
(Interactor 2 in Figure 6) from the lac promoter in the phagemid vector pAO1 (the
upstream cistron could be removed if desired). The amino acid sequence of the flexible
polypeptide linker is not critical; however, it must be of a sufficient length and flexibility
such that the fragment domain and heterologous interactor domain fold independently and
unhindered. The N-terminal, or α fragment library would be expressed as C-terminal
fusions via a flexible polypeptide linker such as a (Gly4Ser)3 linker to the jun helix
(Interactor 1 in Figure 6) from the trc promoter in the compatible pAE1 vector. Coding
sequences for signal peptides would be included if translocation to the periplasm were
desired.

As discussed above, depending on whether the intended application(s) were in vitro
or in vivo, or if in vivo, whether in the cytoplasm or secreted, one or more of the
performance-enhancing tools may be incorporated into the expression vectors to maximize
the probability of selecting the best fragment pair for the intended application(s). If
periplasmic expression is desired, cysteines should be encoded at the break-point termini to
allow disulfide formation. If the enzyme contains other cysteines, at least 1 mM and not
more than 5 mM of a reducing agent such as GSH or DTT should be included in the
growth medium to inhibit the formation of mixed disulfides. If fragment stabilization is 
desired to increase the importance of specific activity in selection, a random or VRK tri- 
peptide library may be encoded in frame with each fragment fusion between the break-point 
terminus and the flexible polypeptide linker. If VRK libraries were used for each fragment 
in a 50-fragment pair library, every possible tri-peptide-fragment combination would be 
contained in a combined library of < 10^ Alternatively, a single tri-peptide library could 
be used for each fragment pair in trans, as was described above. The tri-peptide library 
would be fused operably in frame via the flexible polypeptide linker to the N-terminus of 
thioredoxin and expressed from the upstream cistron in the pAO1 phagemid vector (see
Figure 6). 
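
The diversity contributed by a VRK-encoded tri-peptide can be worked out directly from the degenerate codon. The sketch below is illustrative: it uses the standard IUPAC base codes (V = A/C/G, R = A/G, K = G/T) and the standard genetic code, and it assumes, for the combined library estimate, that each of the two fragments in a pair carries its own independent tri-peptide and that diversity is counted at the DNA level. The function names are hypothetical.

```python
"""Diversity of a VRK-encoded tri-peptide library (illustrative sketch)."""
from itertools import product

# IUPAC degenerate base codes used by the VRK codon.
IUPAC = {"V": "ACG", "R": "AG", "K": "GT"}

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def expand(degenerate_codon: str) -> list[str]:
    """All concrete codons matched by a degenerate codon such as 'VRK'."""
    choices = (IUPAC.get(base, base) for base in degenerate_codon)
    return ["".join(p) for p in product(*choices)]


def dna_variants(n_codons: int = 3) -> int:
    """Number of distinct DNA sequences for n consecutive VRK codons."""
    return len(expand("VRK")) ** n_codons


def peptide_variants(n_codons: int = 3) -> int:
    """Number of distinct peptides encoded by n consecutive VRK codons."""
    residues = {CODON_TABLE[c] for c in expand("VRK")}
    return len(residues) ** n_codons


def combined_library_size(n_fragment_pairs: int, n_codons: int = 3) -> int:
    """Assumption: each of the two fragments in a pair carries its own tri-peptide."""
    return n_fragment_pairs * dna_variants(n_codons) ** 2


if __name__ == "__main__":
    codons = expand("VRK")
    print("VRK codons:", len(codons))
    print("encoded residues:", "".join(sorted({CODON_TABLE[c] for c in codons})))
    print("tri-peptide DNA variants:", dna_variants())
    print("distinct tri-peptides:", peptide_variants())
    print("combined library, 50 fragment pairs:", combined_library_size(50))
```

VRK expands to twelve codons that encode nine hydrophilic residues and no stop codons, so a tri-peptide has 1,728 DNA variants; under the stated assumptions a 50-pair library with a tri-peptide on each fragment would hold about 1.5 x 10^8 combinations.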

The second step in the development of high-performance enzyme fragment 
complementation systems is to construct an expression library of candidate enzyme 
fragment pairs. Methods for generating libraries of random fragment pairs have been 
described (Ostermeier et al., 1999, supra). However, such libraries are quite inefficient as 
the vast majority of fragment pairs will be dysfunctional. For combinatorial screening of 
fragment pair libraries with mutagenic or random tri-peptide libraries, much more efficient 
fragment pair libraries will be necessary. For a variety of reasons it may be assumed that 
the most functional fragment pairs will correspond to scission of the polypeptide chain in 
exposed regions between elements of secondary structure. Exposed break-points will be 
required for use of tethered heterologous interactors and tri-peptides, and scission within 
secondary structure elements can irreversibly destabilize such elements. If a 3-dimensional 
structure is available for the enzyme of interest, or for a homolog, it can be used to identify 
exposed loops as candidate sites for chain scission. Typical globular proteins will not have 
more than 20-25 such sites that are far enough from the ends so that the larger fragment is 
not independently active. This is a manageable number for construction of coding 
sequences for each fragment pair by PCR. Two end-specific primers would be required, 
plus a head-to-head pair of primers for each break-point, which should be located more or 
less in the center of the exposed loop. If a 3-d structure is not available, reliable 
algorithms are available on the internet for computational prediction of secondary structure 
and hydropathy, such as the ProteinPredict program of Rost and Sander (J Mol Biol (1993) 
232:584; Proteins (1994) 19:55; Proteins (1994) 20:216). With such programs, most of 
the exposed loops can be identified as hydrophilic regions between secondary structure 
elements. Again, it would not be excessively burdensome to prepare coding sequences by 
PCR for up to 50 fragment pairs. 
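
The selection of candidate break-points from a predicted secondary structure can be 
sketched as follows. This is a minimal illustration under stated assumptions and is not the 
procedure used in the present work: the function name, the three-state structure string 
(H = helix, E = strand, C = coil), the use of the Kyte-Doolittle hydropathy scale, the 
hydropathy cutoff, and the minimum distance from the chain ends are all hypothetical 
choices. Each loop that lies between two secondary structure elements, is hydrophilic on 
average, and is far enough from either terminus yields one break-point at its center. 

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic, negative = hydrophilic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def candidate_break_points(seq, ss, min_end_distance=30, max_mean_hydropathy=0.0):
    """Return cut positions k; the fragment pair is then seq[:k] and seq[k:].

    seq -- amino acid sequence, one-letter code
    ss  -- predicted secondary structure, same length, using H, E and C
    min_end_distance    -- assumed minimum size of either fragment (hypothetical default)
    max_mean_hydropathy -- assumed cutoff below which a loop counts as hydrophilic
    """
    if len(seq) != len(ss):
        raise ValueError("sequence and secondary-structure string differ in length")
    n = len(ss)
    sites = []
    i = 0
    while i < n:
        if ss[i] != "C":
            i += 1
            continue
        # Extend to the end of this coil segment, which spans ss[i:j].
        j = i
        while j < n and ss[j] == "C":
            j += 1
        # Only loops flanked by secondary structure on both sides qualify.
        if i > 0 and j < n:
            cut = (i + j) // 2  # approximately the center of the loop
            mean_h = sum(KYTE_DOOLITTLE[a] for a in seq[i:j]) / (j - i)
            far_from_ends = min_end_distance <= cut <= n - min_end_distance
            if mean_h <= max_mean_hydropathy and far_from_ends:
                sites.append(cut)
        i = j
    return sites


# Toy input (hypothetical): a helix, a hydrophilic loop, and a strand.
toy_seq = "LLLLLLLLLL" + "GSDGSK" + "VVVVVVVVVV"
toy_ss = "HHHHHHHHHH" + "CCCCCC" + "EEEEEEEEEE"
print(candidate_break_points(toy_seq, toy_ss, min_end_distance=5))
```

Each returned position would correspond to one head-to-head primer pair, used together 
with the two end-specific primers. 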

If fragment complementation does not need to be dependent on the direct or ligand- 
mediated interaction of heterologous domains fused to the break-point termini, then fold- 
accelerating mutations could also be selected by using error-prone PCR in the initial 
amplification of the fragment coding sequences. Under appropriate conditions of Mg2+, 
Mn2+, and nucleoside triphosphate concentrations, as well as cycle number, mutagenesis 
can be limited to 1-3 unbiased coding changes per molecule (Cadwell and Joyce, 1995, in 
PCR Primer-A Laboratory Manual, C. Dieffenbach and G. Dveksler, Eds. Cold Spring 
Harbor Press, Cold Spring Harbor, NY, pp. 583-590). Since most mutations would be 
non-phenotypic, this could easily be combined with the other performance-enhancing tools 
without compromising the selectability of optimal fragment-tri-peptide combinations. Once 
the fragment coding sequences have been amplified, gel-purified, and ligated into the 
vectors, the ligation products may be desalted and concentrated to allow efficient co- 
transformation of E. coli cells by high-voltage electroporation. If both the tri-peptide 
libraries and mutagenesis are used it is advisable to collect at least 10^ and preferably at 
least 10^ transformants to ensure comprehensive representation of the full diversity of the 
library. The full library is then plated onto each of a range of non-permissive conditions, 
the least stringent being that on which the host cells would plate with an efficiency not 
greater than ten times the inverse of the library size. This would ensure a manageable 
frequency of true positives among false positives. The maximum selection stringency 
would be that above which nothing is recovered from the library. 
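
The rule for the least stringent plating condition can be written out as a short calculation. 
The sketch below is illustrative only: the function names, the condition labels, the host 
plating efficiencies, and the library size of 1e7 are hypothetical values chosen for the 
example and are not data from this work. The only element taken from the text is the 
limit itself, namely a host plating efficiency no greater than ten times the inverse of the 
library size. 

```python
def background_threshold(library_size, factor=10.0):
    """Highest tolerated host plating efficiency: factor / library size."""
    if library_size <= 0:
        raise ValueError("library size must be positive")
    return factor / library_size


def least_stringent_condition(host_efficiencies, library_size, factor=10.0):
    """Return the mildest condition on which the host alone plates rarely enough.

    host_efficiencies -- list of (label, plating efficiency of host cells without
                         a functional fragment pair), ordered mildest to harshest
    """
    limit = background_threshold(library_size, factor)
    for label, efficiency in host_efficiencies:
        if efficiency <= limit:
            return label
    return None  # none of the tested conditions is stringent enough


# Hypothetical measurements, ordered from mildest to harshest condition.
example_conditions = [("amp10", 1e-3), ("amp25", 5e-7), ("amp50", 1e-8)]
print(least_stringent_condition(example_conditions, library_size=1e7))
```

Conditions harsher than the one returned would then be tested in turn, up to the 
stringency above which nothing is recovered from the library. 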

If fragment complementation is to be dependent on the direct or ligand-mediated 
interaction of heterologous domains fused to the break-point termini, then mutagenesis 
should not be used because folding acceleration usually eliminates the need for docking 
assistance. In this case selected fragment pairs must be counter-screened for loss of 
activity in the absence of the fos-jun interaction, and activation indexes must be determined 
as the ratio of interaction-dependent activity to interaction-independent activity. For 
interaction mapping within or between proteome libraries, activation indexes of the order of 
at least 10^ are preferred since rare genes are expected to have frequencies in that range. 
For ligand-specific or interaction-specific biosensors lower activation indexes are usually 
acceptable. For example, to detect nanomolar concentrations of a ligand for which 
fragment-binder fusion affinities (Kd) are in the 10 nM range, the fragment-binder fusions 
need only be used at 100 nM concentrations to saturate the ligand. Under these 
conditions ~90% of the fragment-binder fusions will be unbound. If the activation index 
is > 100, the background will be < 10% of the signal. 
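
The arithmetic behind the biosensor example can be checked with a few lines of code. The 
sketch below is a simplified model and rests on assumptions not stated in the text: the 
ligand is taken to be 10 nM and fully saturated by the fusions, bound fusions are taken to 
have unit activity, and unbound fusions are taken to have that activity divided by the 
activation index. The function name is hypothetical. 

```python
def background_fraction_of_signal(total_fusion_nM, ligand_nM, activation_index):
    """Ratio of interaction-independent background to ligand-dependent signal.

    Assumes the ligand is saturated, so the concentration of bound fusion
    equals the ligand concentration and the remainder is unbound.
    """
    if activation_index <= 0:
        raise ValueError("activation index must be positive")
    if not 0 < ligand_nM < total_fusion_nM:
        raise ValueError("ligand must be positive and below the fusion concentration")
    bound = ligand_nM
    unbound = total_fusion_nM - ligand_nM
    signal = bound                         # unit activity per bound fusion
    background = unbound / activation_index
    return background / signal


# 100 nM fusions, 10 nM ligand (assumed), activation index of 100:
# 90% of the fusions are unbound, as in the text.
ratio = background_fraction_of_signal(100.0, 10.0, 100.0)
print(f"background is {ratio:.0%} of the signal")
```

Under these assumptions the background is 9% of the signal, consistent with the stated 
bound of less than 10%; with an activation index of only 10 it would rise to 90%. 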

Selected fragment pairs can be optimized for maximum activity and/or maximum 
activation index. In our experience break-point disulfides produce the highest specific 
activities because they allow the greatest amount of native structure in the fragment 
complex. However, they also may increase the background so that activation indexes are 
often lower. To retain the specific activity benefit of the break-point disulfide and reduce 
the background it may be necessary to retard the rate of disulfide formation so that it would 
not have sufficient time to occur during the abortive attempts of the unaided fragments to 
fold, but would occur efficiently when folding is catalyzed by the heterologous interaction. 
Two parameters may be adjusted to control the formation of break-point disulfides. (1) 
The proximity of the disulfide-forming cysteines to the break-point may be adjusted to 
place greater orientational stringency on disulfide formation. (2) The concentration of 
reducing agent in the medium may be increased to reduce the effective concentration of 
DsbA, the principal disulfide-forming oxidase in the periplasm. 

It is possible to use TEM-1 β-lactamase fragment complementation to select 
fragment pairs of other proteins which do not produce selectable phenotypes in E. coli for 
their ability to form stable complexes, because such complexes will usually be in the native 
conformation and should be functionally active. It has been amply demonstrated that 
naturally evolved proteins have unique minimum energy conformations in which they are 
stable and active (Li et al., Science (1996) 273:666). All other conformations are unstable. 
Thus, if a fragment pair library of a non-phenotypic protein is expressed as fusions to the 
interaction-dependent TEM-1 β-lactamase fragments, it is expected that only those 
fragment pairs which associate and fold into the native conformation will provide sufficient 
docking function to facilitate selectable β-lactamase activation. In this case, the subject 
fragments serve the purpose of the heterologous interactors in facilitating complementation 
of β-lactamase fragments. However, additional modifications could be encoded into the 
fragment/heterologous interactor fusion sequences to enhance functional reassociation of 
the β-lactamase fragments, including a break-point disulfide, a randomly-encoded peptide 
of from 3-12 amino acids, and mutagenesis of several amino acids within the fragment 
domain. All of these tools would specifically impact only complementation of the subject 
fragments by stabilizing the fragments, accelerating folding, and/or stabilizing the active 
fragment complex. Selected fragment pairs could then be tested individually for 
reconstitution of enzymatic activity or other function of the parental protein. In this way 
many useful fragment complementation systems could be developed for proteins which are 
active in eukaryotic cells, such as kinases or herbicide-resistance proteins. 

The interaction-activated enzyme association systems of the subject invention, as 
exemplified by prokaryotic β-lactamase, find use in many applications as summarized 
below. 

(1) Simplex and multiplex protein-protein interaction mapping. Simplex refers to the use 
of single bait proteins to fish natural interactors out of expressed sequence libraries. 
Multiplex refers to the combinatorial pair-wise interaction of two expressed sequence 
libraries for the purpose of simultaneously isolating as many natural interactions as 
possible. Individual interactors can be readily identified by nucleic acid hybridization. 
(2) Interaction-dependent β-lactamase systems may also be used to enrich randomly-primed 
expressed sequence libraries for fragments which encode autonomously-folding 
domains (AFD). Interference with folding by the fusion partner is avoided by 
using epitope tags and hetero-dimerizing helixes only at the N- and C-termini of the 
expressed sequence, respectively. The fragments would have N- and C-terminal anti-tag 
binder and the partner hetero-dimerizing helix. The disulfide switch can 
accommodate diverse interaction geometries. 
(3) Simplex and multiplex selection of binding molecules such as single chain antibody 
fragments (scFv) and antibody light chain variable regions (VL). Non-immune 
human scFv repertoire libraries can be used with TEM-1 β-lactamase interaction-dependent 
activation systems to isolate scFv to single baits or simultaneously to 
expressed sequence libraries. In the latter case scFv specific for individual targets 
can be readily identified by nucleic acid hybridization. 
(4) Interface mapping and ligand identification by mimotope homology. Constrained 
peptide libraries displayed on the surface of a carrier or "scaffold" protein may be 
used with β-lactamase interaction-dependent activation systems to isolate surrogate 
ligands for proteins or AFDs of interest. Consensus sequences from panels of such 
surrogate ligands for a given polypeptide may then be used to identify natural ligands 
of the polypeptide or interaction surfaces on natural ligands of the polypeptide. A 
common application of interface mapping is epitope mapping for antibodies, whereby 
the specific region to which an antibody binds on the surface of its antigen is 
identified. 
(5) Bio-Action Sensors. The efficiencies of most screening systems for signal 
transduction agonists and antagonists are compromised by the need for multiple steps 
between receptor ligation and selectable phenotype generation, which usually requires 
de novo gene expression. Interaction-activated β-lactamases can be tailored for 
activation or inhibition by any component of a target signal transduction pathway to 
allow selection of agonists or antagonists of the pathway in any appropriate cell type 
without the need to wait for gene expression to generate a selectable phenotype. 
(6) Homogeneous Assays. Interaction-dependent complementing fragments can be fused 
to two scFv or other binding molecules which bind non-overlapping epitopes on 
target molecules, so that β-lactamase activation becomes dependent on binding to the 
target ligand. The use of ligand-dependent β-lactamases in homogeneous assays for 
two-epitope analytes from proteins to pathogens affords unparalleled sensitivity 
because saturation kinetics can be used instead of the equilibrium kinetics required by 
most assays. The binding molecules could also be oligonucleotides which anneal to 
contiguous sequences in the genome of a target pathogen. Such sequence-activated 
β-lactamases could also be used for rapid quantitation of specific PCR products 
without the need for gel electrophoresis. 
(7) Target-Activated Enzyme Prodrug Therapy (TAcEPT) and Target-Activated Enzyme 
Imaging (TAcEI). Antibody-directed enzyme prodrug therapy is a promising 
chemotherapeutic strategy in which patients are treated with prodrug-activating enzymes 
such as β-lactamase conjugated to tumor-targeting antibodies (Bagshawe, 1995, 
supra). When unbound antibody-enzyme conjugate has cleared the circulation, 
prodrugs can be administered which are preferentially activated at the site of the 
tumor. The efficacy of this therapy is severely limited by the need for unbound 
conjugate to clear the circulation before the prodrug can be administered in order to 
avoid excessive toxicity, during which time most of the bound enzyme is lost from 
the tumor. The use of tumor-activated β-lactamases allows the prodrug to be 
administered at peak tumor loading of the enzyme since the latter is inactive in the 
circulation, and can only activate the prodrug when bound to the tumor. The same 
strategy can be used for antibody-directed site-specific activation of reagents for 
imaging of tumors or other tissue pathologies, or for other therapeutic indications 
such as inflammation or transplant rejection. 

The following examples are offered by way of illustration of the present invention, 
not limitation. 

EXPERIMENTAL 

Example 1 

β-lactamase Activation by Interaction-Mediated Complementation of α197 and ω198: 
Interactions between scFv and trxpeps 

This example demonstrates the ability of the system to detect and discriminate 
specific interactions between single-chain antibody Fv fragments (scFv) and 12-amino acid 
peptides inserted into the active site of E. coli thioredoxin (trxpeps, Colas et al., Nature 
(1996) 380:548). ScFv are comprised of antibody heavy chain and light chain variable 
regions (VH and VL) tethered into a continuous polypeptide by most commonly a 
(Gly4Ser)3 linker encoded between most commonly the C-terminus of VH and the N- 
terminus of VL. 

ScFv from a human non-immune antibody repertoire were amplified by PCR using 
a consensus primer mix (Marks et al., Eur J Immunol (1991) 21:985) and subcloned into a 
pUC119-based phagemid vector (Sambrook et al., supra) for expression of the scFv as 
fusions to the N-terminus of the ω198 fragment with an intervening (Gly4Ser)3 linker 
(pAO1; see Figure 6A). An N-terminal signal peptide was provided for translocation to 
the bacterial periplasm. A commercial trxpep library was obtained and amplified by PCR 
using primers specific for the N- and C-termini of E. coli thioredoxin (Genbank accession 
no. M54881). This product was subcloned into a p15A replicon (Rose, Nuc Acids Res 
(1988) 16:355) for expression as fusions to the C-terminus of the α197 fragment from the 
trp-lac fusion promoter (pAE1; see Figure 6B). Again, an N-terminal signal peptide was 
provided for translocation to the periplasm. Figure 7 illustrates the activation of TEM-1 by 
complementation of α197 and ω198, mediated by interaction between an scFv and a 
trxpep. 

It was estimated that about 20% of the original scFv library clones produced 
soluble, full-length scFv as judged by immunoblot analysis (Harlow and Lane, (1988) In 
Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring 
Harbor) of periplasmic extracts obtained by osmotic shock (Neu and Heppel, J Biol Chem 
(1965) 240:3685). Thus, approximately 60 clones had to be screened in this way to obtain 
twelve clones expressing functional scFv. Plasmid DNA representing these twelve clones 
of the scFv-ω198 construct was co-transformed with DNA representing approximately 
5x10^ clones of the α197-trxpep construct into E. coli strains DH5α and TG1 (Sambrook et 
al., 1989, supra), and plated onto solid LB medium containing kanamycin and 
chloramphenicol to determine the total number of co-transformants. Aliquots were also 
plated onto 25 µg/ml ampicillin (amp25). Out of approximately 1x10^ total co- 
transformants, 40 ampicillin-resistant clones were recovered, 36 of which replated on 
amp25. A similar number of co-transformants of a single randomly selected α197-trxpep 
construct with the twenty scFv-ω198 constructs produced no colonies on amp25. All 
twelve scFv were represented in the 36 ampicillin-resistant clones with from one to five 
different trxpeps each. None of the 12 scFv cross-reacted with any trxpep originally 
selected by another scFv, as determined by co-transforming each scFv-ω198 construct with 
a pool of the α197-trxpep constructs selected by the other scFv. Thus, all 36 selected 
clones were bona fide positives, representing unique and specific scFv-trxpep interactions. 
No scFv bound thioredoxin in the absence of its peptide mimotope(s), and no selected 
trxpep bound common determinants on the scFv. Selections were performed in the E. coli 
host strain TG1 without the gratuitous de-repressor of the lac promoter, isopropyl 
thiogalactoside (IPTG), so that transcription was minimal. When transcription was 
increased by the presence of 1 mM IPTG, many more colonies were obtained. Several of 
these were shown to be bona fide interactions which were too weak to confer selectable 
ampicillin resistance at lower levels of expression. Thus, the stringency of selection can be 
tuned by adjusting the expression levels of the interactors. 

These results have several important implications. First, the false positive rate was 
exceedingly low, much lower than has been reported for other intra-cellular interaction 
sensors such as the yeast two-hybrid system (Bartel et al., 1993, supra; Bartel et al., 1996, 
supra). This property is essential for high-throughput applications. Secondly, the false 
negative rate with respect to the scFv was immeasurably low, as trxpeps were recovered 
for all functional scFv, and this too is essential for high-throughput applications. The fact 
that mimotopes were recovered for all scFv enables the system for high-throughput 
multiplex epitope mapping for scFv. Finally, the system is capable of efficient recovery of 
multiple interactions between two diverse populations of proteins simultaneously. 
Ultimately, given the high efficiency of the system, i.e., low rates of false positive and 
false negative selection, the throughput of the system should be limited only by the sizes of 
the interacting libraries, and/or the number of co-transformants which can be handled 
conveniently. For example, construction of recombinant protein libraries in the 10^-10'^ 
range is routinely possible for scFv, trxpeps, or cDNAs (Hoogenboom et al., Immunotech 
(1998) 4:1). Combinatorial pair-wise interaction trapping for any two such libraries would 
require at least 10'^-10^° clones, but with quantitative phagemid infection methods 
(Sambrook et al., 1989, supra) and automated fermentation and plating methods, such 
throughput levels could be realistically achieved. 



Example 2 

β-lactamase Activation by Interaction-Mediated Complementation of α197 and ω198: 
Interactions between antibody light chain V-regions (VL) and trxpeps 

This example demonstrates the ability of the system to work with larger antibody 
fragments such as Fab, which are comprised of entire light chains disulfide-bonded to Fd 
fragments which contain VH plus the first heavy chain constant region. A subset of Fabs 
from a human repertoire library was subcloned for expression as C-terminal ω198 fusions 
from a dicistronic transcript from the lac promoter in the pAO1 vector (see Figure 6A). 
The first cistron encoded the light chain with a signal peptide for translocation to the 
periplasm. The light chain termination codon was followed by a short spacer sequence and 
then a ribosome binding site approximately 10 bp upstream from the start of translation for 
the signal peptide of the Fd fragment, which was followed by ω198 with an intervening 
(Gly4Ser)3 linker. This construct was then co-expressed with the α197-trxpep library in the 
pAE1 vector in strains DH5α and TG1. Spontaneous association of the light chain with the 
Fd-ω198 fusion protein in the periplasm was expected to produce a functional Fab 
fragment. Binding of the latter to the peptide on an α197-trxpep fusion was then expected to 
facilitate assembly of the functional TEM-1 β-lactamase in amounts sufficient to confer 
selectable resistance to ampicillin on the host cells. 

Many clones were in fact recovered on 25 µg/ml ampicillin. Some of these are 
listed in Table 1 below. Several were resistant to up to 100 µg/ml and one was resistant to 
up to 600 µg/ml. Unexpectedly, all recovered Fabs were missing the VH region. That is, 
they contained the full-length light chain (LC) with only the first heavy chain constant 
region (CH1). The reasons for this were as follows. The original Fab library was 
constructed by first inserting the VL repertoire into the vector which already contained the 
constant regions ready for expression. This intermediate construct was capable of 
expressing a complex of the light chain with the first heavy chain constant region fused to 
ω198. Plasmid DNA was then purified from this light chain library and used as the 
recipient for insertion of the VH repertoire to complete the Fab library. The resulting 
library was contaminated with approximately 15% of clones which contained the 
intermediate vector. Only these LC-CH1 complexes were capable of driving α197-ω198 
complementation by binding of the VL combining site with the peptide on the appropriate 
trxpep. It is not known why full-length Fabs were not selected. However, the larger size 
and rigidity of the Fab-trxpep complex (~67 kDa) may have sterically inhibited fragment 
complementation, whereas the smaller size and flexibility of the LC-CH1 complex did not. 

TABLE 1. 

Ampicillin Resistance of TEM-1 β-lactamase α197/ω198 Fragment 
Complementation Driven by Interactions between 
Antibody Light Chain-CH1 Complexes and Trxpeps 

LC-CH1         Trxpep         Amp^a 

P44-2-2B1      P44-2-2A1      +++++ 
P44-2-3B1      P44-2-3A1      ++ 
P44-1-6B1      P44-2-6A1 
P64-17B1       P64-17A1       ++ 
P65-1-10B1     P65-1-10A1     +++ 
P66-3-2B1      P66-3-2A1      ++ 
P66-3-10B1     P66-3-10A1     + 
P66-3-14B1     P66-3-14A1     ++ 
P75-7-7        ?              >+ 
P75-7-13       ?              >+ 
P75-7-30       ?              >+ 

a. +, ++, +++, +++++: >10% plating efficiency on 25, 50, 100, 600 µg/ml ampicillin, 
respectively. 

This result shows that light chain V-regions alone, which are only ~12 kDa in size, 
could make convenient high-affinity binding molecules for antigen-dependent activation of 
β-lactamase by fragment complementation. To test this, the VLs from several of the 
selected LC-CH1 were subcloned for expression alone as C-terminal fusions to ω198. 
When each was co-expressed with its partner α197-trxpep, approximately one-third of the 
VL conferred selectable resistance to ampicillin comparable to the parent LC-CH1s. 

Example 3 

β-lactamase Activation by Interaction-Mediated Complementation of α197 and ω198: 
Interactions between CD40 and trxpeps 

This example demonstrates the ability of the present system to isolate panels of 
trxpeps that bind to a given protein of interest, and which could be used to map interaction 
surfaces on the protein, and which could also assist in the identification of new ligands by 
homology. The extra-cellular domain of the human B-cell activation antigen CD40 is 
known to reliably express in the E. coli periplasm (Noelle et al., Immunol Today (1992) 
13:431; Bajorath and Aruffo, Proteins: Struct, Funct, Genet (1997) 27:59). A T-cell 
surface molecule, CD40 ligand (CD40L), is known to co-activate B-cells by ligation to 
CD40, but there may be other ligands. Therefore, TEM-1 α197/ω198 fragment 
complementation was used to select a panel of CD40-binding trxpeps. The sequences of 
these peptides would then be examined for homology to the known ligand and other 
potential ligands. The coding sequence for the mature form of the extra-cellular domain 
(CD40ED) was amplified by PCR using primers homologous to the N-terminus of the 
mature protein and to the C-terminus of the ~190-residue extra-cellular domain (Genbank 
accession no. X60592). The PCR product was then subcloned into the pAO1 phagemid 
vector (Figure 6A) for expression from the lac promoter as a C-terminal fusion to the 
TEM-1 ω198 fragment with an intervening (Gly4Ser)3 linker. Expression of the correct 
product was confirmed by PAGE, and the CD40 fusion vector was then rescued as phage 
and transfected into TG-1 cells bearing the same trxpep library construct as described 
above. Approximately 10^ co-transformants were collected by double selection on 
kanamycin and chloramphenicol, and then plated onto 25 µg/ml ampicillin. Activation of 
TEM-1 by a trxpep-CD40 interaction-mediated complementation of α197 and ω198 is 
depicted in Figure 8. 

Ampicillin-resistant clones encoding thirteen unique trxpeps were recovered. In all 
cases amp resistance was strictly dependent on the presence of CD40ED and the peptide 
portion of the trxpep. No activity was seen if CD40ED was replaced with an irrelevant 
protein or if the trxpep was replaced by wild-type thioredoxin. The sequences of the 
selected CD40-binding peptides are shown in Table 2 below along with their homologies to 
each other and to CD40L. The thirteen peptides sort into eight homology groups: two 
groups with three each (1 and 2), one with two (3), and five with one each. Groups 1 and 
2 are defined by homology of three peptides in each group to the same region of CD40L. 
Group 1 is homologous to the region of CD40L from Pro217 to Gly234, and Group 2 is 
homologous to the region from Gly158 to Leu168. Group 3 is defined only by inter- 
peptide homology and has no detectable homology to CD40L. Group 4 is homologous to 
CD40L from Ser110 to Pro120, and Group 5 is homologous to CD40L from Pro244 to 
Gly257. Groups 6-8 have no discernible homologies. However, a number of the peptides 
had striking homology to other human extra-cellular proteins, including CTLA-2A, a 
matrix metalloproteinase, a receptor Tyr phosphatase, vascular endothelial cell growth 
inhibitor (VEGI), transferrin receptor, CD3ζ, and bone morphogenetic protein 3B (BMP- 
3B). These may define an interaction motif or motifs, which have been used repeatedly for 
extra-cellular protein-protein interactions. They may also indicate multiple interaction sites 
on CD40. 

Inter-trxpep competition was tested by expressing each of five selected CD40- 
binding trxpeps from a second cistron in the pAO1 phagemid vector, downstream from the 
CD40-ω198 fusion. Each of these constructs was then co-expressed with each of the 
same five plus three additional selected α197-trxpep fusion constructs in strain TG1 and 
scored for growth on 25 µg/ml ampicillin. The results are shown in Table 3 below. The 
eight trxpeps sorted into five groups. BW10-1 competes moderately with groups 2 and 3. 
p58-12-9A1, BW10-4, and BW10-8 compete strongly with each other and have similar 
competition profiles. They do not compete with group 3, except for BW10-8, which 
competes slightly with group 3 and BW10-9. All three compete with BW10-1, and p58-12- 
9A1 also competes slightly with BW10-9. p44-4-2A1 and p45-7-2A3 compete strongly and 
have similar competition profiles. They compete with BW10-1 and nothing else except 
BW10-8 slightly. BW10-9 competes slightly with BW10-8 and p58-12-9A1. p65-2-9A1 is 
inhibited by nothing. 

In general, the competition data are consistent with the homology data with the 
caveat that simultaneous binding to non-overlapping epitopes is sometimes not tolerated. 
This allows unrelated sequences like p58-12-9A1 and BW10-8 to compete strongly with 
one another and have similar competition profiles. This is probably due to steric 
interference with enzyme reassembly, and may account for the discordance between 
homology and competition data for BW10-1 and p58-12-9A1 in particular. These two 
probably bind near the same CD40 interaction epitope, which may sterically inhibit 
fragment complementation for many (but not all) other trxpeps. 

For some applications it will be useful for β-lactamase activation to be mediated by 
simultaneous binding of both α197 and ω198 to non-overlapping epitopes on a separate 
molecule, either a free ligand or cell surface receptor. Two CD40-binding trxpeps, which 
had been identified as non-competing by the competition tests, were used to test this utility. 
One of the two trxpeps was subcloned for expression as the C-terminal ω198 fusion from 
the pAO1 vector (see Figure 6). The other trxpep was expressed as the α197 fusion from 
the pAE1 vector as before. Co-expression of these two constructs was used as the negative 
control. To test for CD40-mediated activation, the CD40ED coding sequence (including 
signal peptide) was subcloned into the trxpep-ω198 expression cassette between the 
promoter and the trxpep-ω198 sequence. An additional 20 bp containing a ribosome 
binding site was included downstream from the CD40 stop codon to allow expression of 
both CD40 and trxpep-ω198 from the same dicistronic transcript, as was described above 
for the Fab. As shown in Table 4 below, CD40 expression induced resistance to 50 µg/ml 
ampicillin, whereas without CD40 the cells expressing the control constructs produced 
fewer than 10^-6 colonies per cell on 25 µg/ml ampicillin. Thus, β-lactamase fragment 
complementation can be efficiently induced by a tri-molecular protein-protein-protein 
interaction. 

Table 4 

Ligand activation of TEM-1 α/ω fragment complementation using 
non-competing CD40-binding trxpeps and CD40ED. 

Molecule #1    Molecule #2    Molecule #3    Amp^a 

α-p44-4-2      CD40-ω 
α-p44-4-2      CD40           BW10-1-ω 
α-p44-4-2      -              BW10-1-ω 

a. Plating efficiencies on 25 µg/ml ampicillin in colonies per cell: -, <10^-6; 
+, >10%; ++, >25%; +++, >50%. 



Example 4 



β-lactamase Activation by Interaction-Mediated Complementation of α197 and ω198: 
Interaction between a CD40-specific scFv and CD40 

Since β-lactamase activation by α197-ω198 fragment complementation could be 
driven efficiently by interaction between scFv and trxpeps, it was important to show that it 
could also be driven by interaction between scFv and a bona fide protein antigen, 
preferably a cell surface receptor. This was especially important because the ligand- 
binding domains for type 1 trans-membrane receptors are N-terminal; therefore their 
expression as C-terminal fusions is preferred. However, the preferred orientation for scFv 
expression is also N-terminal. To allow expression of both scFv and antigen as C-terminal 
fusions, β-lactamase activation by a tri-molecular interaction was tested, including the C- 
terminal fusion of the scFv with ω198, a C-terminal fusion of CD40 with the fos helix, and 
a C-terminal fusion of α197 with the jun helix. The expression constructs were analogous 
to those used for CD40 ligation of the trxpep-fragment fusions. The CD40-fos fusion and 
the scFv-ω198 fusion were expressed from a dicistronic transcript in the pAO1 vector, and 
the α197-jun fusion was expressed from the pAE1 vector. The fos-jun interaction has a Kd in 
the 10"^M range, so it should quantitatively ligate CD40 with α197, which are much more 
abundant than this in the periplasm. Binding of the scFv to CD40 should then dock ω198 
with the complex to facilitate fragment complementation. As shown in Table 4, CD40-fos 
expression induced resistance to up to 100 µg/ml ampicillin, whereas cells expressing only 
the control constructs without CD40-fos again produced fewer than 10^-6 colonies per cell 
on 25 µg/ml ampicillin. Thus, β-lactamase fragment complementation can be efficiently 
induced by a tri-molecular interaction of two extra-cellular proteins in preferred C-terminal 
fusions. 

Example 5 

Disulfide-Enhanced Fragment Complementation 

The β-lactamase activity produced by interaction-dependent complementation of the 
α197 and ω198 fragments is substantially less than that of the wild-type enzyme under the 
same expression conditions. This loss of activity could be due to a tendency of the 
fragments to aggregate or turn over when they are not folded into the native conformation, 
and it could also reflect a loss of specific activity due to the reduced ability of the loosely 
tethered heterologous interaction to stabilize the native conformation. It was reasoned that 
both folding kinetics and stability could be enhanced by the introduction of a disulfide at 
the break-point, and this could lead to a substantial increase in interaction-dependent 
activity. The expectation was that when the fragments were docked by the heterologous 
interaction, the integrity of the polypeptide backbone would be restored at some point in 
the folding pathway by the formation of a disulfide linkage between cysteines added at the 
break-point, and this would accelerate folding and/or stabilize the active conformation. 
The disulfide would form very rapidly in the highly oxidizing environment of the bacterial 
periplasm. However, if the fragments were unstable until they were docked and folded, 
but once folded the activity was stable, then the break-point disulfide might have little 
effect on activity if it did not form until late in the folding pathway. 

Cysteines were added to the sequences of α197 and ω198, between the break-point
termini and the linkers leading to the heterologous interactors. With the fos and jun
helixes as the interactors, quantitative ampicillin resistance (> 10% plating efficiency)
increased from 50 µg/ml to more than 100 µg/ml, and the plating efficiency on 25 µg/ml
ampicillin increased at least 2-fold. Thus, disulfide formation must be accelerating folding
and/or stabilizing the active conformation. However, the disulfide produced nearly as
much activity without the interactors. This contrasts sharply with the activity of the
fragments in the absence of either the disulfide or interactors, for which plating efficiencies
are less than 10"^ on 25 µg/ml ampicillin. This result suggests that the fragments probably
associate and refold readily on their own at these intra-cellular concentrations, but that
without a heterologous interaction or disulfide at the break-point, either folding cannot
progress to the active conformation, or the latter is not stable enough to produce selectable
activity. There must be a finite window of opportunity for disulfide formation when the
thiols are proximal during unassisted folding. This window should be much wider during
interaction-assisted folding. Thus, it should be possible to retard disulfide formation and
thereby make it more dependent on the heterologous interaction.

Disulfide formation was made to be more dependent on the heterologous interaction
by two modifications. First, disulfide formation could be inhibited by inclusion of a
reducing agent in the growth medium. Dithiothreitol (DTT) at 10 mM reduced the plating
efficiency of the disulfide-assisted fragments on 100 µg/ml ampicillin to < 10^ colonies per
cell in the absence of an interaction, whereas with the fos-jun interaction the activity of the
same fragments was little affected by DTT, so that the activation index was increased to
> 1000-fold. Secondly, the cysteines were shifted by one residue each away from the
break-point and into the β-lactamase sequence, so that they became separated in the native
fold by an additional ~8 Å. This reduced activity to a plating efficiency of < 10^ on 50
µg/ml ampicillin without the interaction, whereas with the fos-jun interaction the plating
efficiency was reduced to ~10% on 50 µg/ml ampicillin for an activation index of > 10^.
Thus, a combination of reducing agent and thiol separation may be expected to increase the
increment of interaction-dependent activation over background even further, perhaps to
> 10^. In any case the 8 Å increase in thiol separation alone increased the activation
increment substantially over that of the fos-jun interaction without disulfide. The
enhancement of interaction-dependent specific activity provided by the disulfide should
allow weak interactions and/or poor expressors to produce selectable β-lactamase activity
with fewer than 10 molecules per cell of the activated enzyme.

The ability of the break-point disulfide to enhance activation of TEM-1 α197/ω198
fragment complementation suggests that break-point disulfides might be able to activate
many enzyme fragment pairs which produce weak or no selectable activity with a
heterologous interaction alone. The heterologous interaction may be essential for fragment
docking, but since it is tethered with ~60 Å linkers it cannot restore the tight junction of
the polypeptide backbone at the break-point. However, formation of a disulfide across the
break-point should restore the integrity of the backbone, and should thereby help stabilize
the active site of the complex. This idea was tested by screening nine additional pairs of
TEM-1 β-lactamase fragments, corresponding to scission in nine exposed loops of the
polypeptide chain. The nine fragment pairs were screened for selectable activity with the
break-point disulfide alone, the fos-jun interaction alone, and with both together. The
results are summarized in Table 5.

Addition of the break-point disulfide to the fos-jun interaction strongly increased the
activity of seven of the nine fragment pairs, which makes eight out of ten pairs when
α197/ω198 is included. The ten fragment pairs may be sorted into three groups. One
group comprises the two negative pairs. The second group comprises three pairs which
can only be activated by disulfide and fos-jun interaction together. In each case, the
plating efficiency is at least 10% on 25 µg/ml ampicillin, with an activation index of at
least 1000. The third group comprises five pairs, all from break-points in the C-terminal
third of the molecule, which produce modest-to-robust activity with fos-jun alone, but
potent activity with both fos-jun and the disulfide together. Most importantly, four of the
five produce no selectable activity with the disulfide alone, so they have very large
activation indexes. P174/N175 had the highest activation index, ≥ 10^ on 100 µg/ml
ampicillin. G253/K254 had the highest activity with a plating efficiency of >25% on 400
µg/ml ampicillin. Interestingly, the first fragment pair identified to exhibit
interaction-dependent activation, α197/ω198, remains the only pair to produce robust selectable
activity with the break-point disulfide alone. It is possible that activation of some pairs is
inhibited by the formation of mixed disulfides between the break-point cysteines and the
internal cysteines, and it is also possible that such inhibition could be alleviated with
exogenous reducing agent. However, it is at least as likely that in these cases unassisted
refolding could not proceed far enough to allow efficient formation of the break-point
disulfide before aborting.
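
The grouping above rests on the activation index. As a minimal sketch, assuming the index is simply the ratio of plating efficiency with the heterologous interaction to plating efficiency without it at the same ampicillin concentration, the calculation looks as follows. The function name and the background value are illustrative assumptions and are not taken from Table 5.

```python
def activation_index(eff_with_interaction: float, eff_without_interaction: float) -> float:
    """Ratio of plating efficiencies (colonies per cell plated) with and without the interaction.

    Assumed definition for illustration; both values must be measured at the
    same ampicillin concentration for the ratio to be meaningful.
    """
    if eff_without_interaction <= 0:
        raise ValueError("background plating efficiency must be positive")
    return eff_with_interaction / eff_without_interaction


# Hypothetical fragment pair: 10% plating efficiency with fos-jun plus the
# disulfide, and an assumed background of 1 colony per 10,000 cells without them.
index = activation_index(0.10, 1e-4)
print(f"activation index ~ {index:.0f}")
```

With these assumed numbers the index comes out near 1000, the threshold quoted for the second group of fragment pairs.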



[Table 5: Selectable activity of the TEM-1 β-lactamase fragment pairs with the break-point
disulfide alone, the fos-jun interaction alone, and both together]



The fact that the fragment pairs which produced the highest activities are not the
same as those with the highest activation indexes, and vice versa, indicates that different
fragment pairs may be optimally suited for different applications. For example, the
activation index is more important than maximum activity for intra-cellular interaction
mapping, where natural interactions must be identified against backgrounds of 10^ or more
non-interacting pairs. Thus, P174/N175 may be the best fragment pair for intra-cellular
interaction mapping. On the other hand, maximum activity is more important than the
activation index for in vitro applications because the activating target ligands will always be
limiting in such applications. Since for maximum activation the fragments need only be
used in ten-fold excess over their Kds for the ligand, the activation index need only be 1000
for a signal-to-noise ratio of 100. Thus, G253/K254 may be the best fragment pair for in
vitro applications such as biosensors or homogeneous assays.
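
The arithmetic behind the ten-fold excess and the signal-to-noise figure can be sketched as below. Two assumptions are made that the text does not spell out: ligand occupancy follows a simple one-site binding isotherm with the ligand limiting, and background scales with the fold excess of unactivated fragments, so that signal-to-noise is the activation index divided by that excess. Both function names are hypothetical, and the second model is only one reading of the passage.

```python
def fraction_bound(fragment_conc: float, kd: float) -> float:
    """Fraction of a limiting ligand occupied by fragment at fragment_conc (same units as kd).

    Assumes a simple one-site binding isotherm.
    """
    return fragment_conc / (fragment_conc + kd)


def signal_to_noise(activation_index: float, fold_excess: float) -> float:
    """Assumed model: background grows in proportion to the excess of unactivated fragments."""
    return activation_index / fold_excess


kd = 1.0                                   # arbitrary concentration units
occupancy = fraction_bound(10 * kd, kd)    # fragments at ten-fold excess over Kd
snr = signal_to_noise(1000, 10)
print(f"occupancy = {occupancy:.2f}, signal-to-noise = {snr:.0f}")
```

Under these assumptions a ten-fold excess over Kd occupies about ten elevenths of the ligand, which is why little is gained by adding more fragment, and an activation index of 1000 leaves a signal-to-noise ratio of 100.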

The break-point disulfide overcomes a significant shortcoming of interaction-
dependent enzyme fragment complementation systems. It is essential for high-throughput
applications that such systems be capable of efficient activation by a wide range of
heterologous protein-protein interactions. In other words, to minimize the false negative
rate, the system must be activatable by any interaction between two proteins or fragments
within the size range of single, naturally evolved protein domains, i.e., between ~100 and
300 amino acids in length. Globular proteins in this size range have radii in the range
~30-50 Å. This means that the points of attachment for the linkers could be up to 100 Å
apart, and this distance must be spanned by the linkers in order for the break-points of the
fragments to be able to come together. For this reason, the (Gly4Ser)3 linker was selected,
which is expected to be fully extended and flexible, and to have a length of ~60 Å, thereby
providing a combined length of up to 120 Å to allow close approach of the break-point
termini during folding. Nevertheless, it is reasonable to expect the stability of the active
conformation to be quite sensitive, and generally inversely proportional to the dimensions
of the heterologous interaction. Thus, for all such systems described to date it may be
assumed that the longer the linkers, the larger the proportion of possible interactions that
can accommodate refolding, but the less the interaction can contribute to stabilization of the
active conformation.
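
The linker geometry described above reduces to a short calculation. The sketch below assumes a rise of 4 Å per residue, which is simply the quoted ~60 Å divided by the 15 residues of (Gly4Ser)3; the constant and function names are illustrative.

```python
RESIDUES_PER_REPEAT = 5        # Gly-Gly-Gly-Gly-Ser
ANGSTROM_PER_RESIDUE = 4.0     # assumed: ~60 A over 15 residues, as quoted in the text


def linker_length(repeats: int = 3) -> float:
    """Approximate fully extended length of a (Gly4Ser)n linker, in angstroms."""
    return repeats * RESIDUES_PER_REPEAT * ANGSTROM_PER_RESIDUE


def can_span(separation: float, repeats: int = 3, n_linkers: int = 2) -> bool:
    """True if the linkers on both fragments together can bridge the given separation."""
    return n_linkers * linker_length(repeats) >= separation


print(linker_length())     # one (Gly4Ser)3 linker
print(can_span(100.0))     # attachment points 100 A apart
```

Two (Gly4Ser)3 linkers give a combined reach of 120 Å, enough for attachment points up to 100 Å apart. Raising `repeats` extends the reach at the cost described in the text: the interaction contributes less to stabilizing the active conformation.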

The break-point disulfide overcomes this limitation because, if the linkers are long 
enough, it will form readily during re-folding, and once the break-point disulfide is formed
the specific activity of the reconstituted enzyme should be independent of the dimensions of
the heterologous interaction, and in fact should not even require the continued integrity of 
the interaction. Thus, the break-point disulfide acts as a one-way switch, with an 
activation energy which can be supplied by a broad range of heterologous interactions, 
limited only by the ability of the interactors to fold properly, and by the length of the 
linkers to allow close approach of the break-point cysteines. This has two important 
consequences which allow a larger proportion of natural interactions to produce selectable 
activity. Longer linkers can be used, and interactions which are too weak to sustain 
selectable enzyme activity by themselves should still be able to "throw the disulfide 
switch" to produce selectable activity. 

Example 6 

Peptide-Enhanced Fragment Complementation

Another way to enhance interaction-dependent enzyme fragment complementation is
to introduce short, random peptide sequences at the break-points, and then to select for
increased activity with a model interaction. Such peptide-dependent enhancements could
occur by any of several mechanisms. For example, the peptides could stabilize the active
conformation of the reconstituted enzyme by interacting with each other or with the enzyme
itself, or the peptides could stabilize one or both of the fragments, thereby increasing
steady-state activity by increasing fragment concentration.

Synthetic oligonucleotides were used to add three randomized residues to each
fragment between the break-point residue and the linker for the heterologous domain. As
the model interaction, the c-fos helix at the N-terminus of ω198 and the c-jun helix at the
C-terminus of α197 were used. For each randomized position, a degenerate codon was
used, which encoded a subset of amino acids biased toward charged residues to
favor charge-charge interactions, which are the strongest. The VRK codon places c, a, or
g in the first position, a or g in the second position, and t or g in the third position. The
encoded amino acids are His, Gln, Arg, Asn, Lys, Ser, Asp, Glu, and Gly. For three
randomized positions in both fragments there are a total of 12^6 = 3x10^6 possible codon
combinations, and 9^6 = 5.3x10^5 possible different amino acid sequences. Initially, ten
thousand clones of the library were plated onto successively higher concentrations of
ampicillin until no colonies were recovered. Six clones in the DH5α strain were recovered
from 800 µg/ml ampicillin, and all six showed strict dependence on the fos-jun interaction
for growth. In fact, the jun helix was removed from α197 in the same starting 10^4 clones
of the library, and when these clones were plated onto the same concentrations of
ampicillin, only a few colonies grew on 200 µg/ml ampicillin, and no colonies appeared on
higher concentrations. This level of ampicillin resistance is comparable to that produced
by the fos-jun interaction alone.
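
The codon and sequence counts for the VRK library can be checked by enumeration. The sketch below builds the standard genetic code, expands the degenerate VRK codon (V = a, c, or g; R = a or g; K = g or t), and counts codons and encoded amino acids over the six randomized positions (three per fragment). Variable names are illustrative.

```python
from itertools import product

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate VRK codon: allowed bases at positions 1, 2 and 3.
VRK = ("ACG", "AG", "GT")
vrk_codons = ["".join(c) for c in product(*VRK)]
encoded = {CODON_TABLE[c] for c in vrk_codons}

positions = 6  # three randomized residues in each of the two fragments
print(len(vrk_codons), "codons encode", "".join(sorted(encoded)))
print("codon combinations:", len(vrk_codons) ** positions)
print("amino acid sequences:", len(encoded) ** positions)
```

The enumeration gives 12 codons and 9 amino acids per position, so 12^6 = 2,985,984 codon combinations and 9^6 = 531,441 amino acid sequences, in line with the rounded figures in the text. Arg and Gly are each encoded by more than one VRK codon, which is why the codon count exceeds the amino acid count.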

Unexpectedly, all six selected clones recovered from DH5α had the same α
tri-peptide, Gly-Arg-Glu (GRE), and each had a different ω tri-peptide. When the ω
tri-peptides were removed, there was no significant reduction in activity, suggesting that the
ability of the GRE sequence to enhance fragment complementation did not depend on the
presence of the ω tri-peptide. Thus, the GRE α tri-peptide produced a profound
enhancement of the interaction-dependent activity, but it cannot substitute for the
interaction. In fact, without the interaction the GRE tri-peptide does not seem to increase
the background at all, thus it neither accelerates refolding nor stabilizes the folded
complex. The most likely effect of the GRE tri-peptide is to stabilize the α197 fragment by
interfering with loss of the fragment by amorphous aggregation. Since the ω198 fragment
is quite stable, but the α197 fragment is somewhat less so, the latter is expected to be
limiting for fragment complementation, and any stabilization of α197 leading to an increase
in its concentration would increase the steady-state activity of the interaction-activated
enzyme accordingly. Though the GRE tri-peptide could inhibit aggregation of α197, it
apparently did not interfere with re-folding of the fragment complex. Since aggregate
formation proceeds exponentially, it is exquisitely sensitive to small shifts in the
inter-molecular association rate constants (Dobson, Trends Biochem Sci (1999) 24:329). Thus,
even weak binding of the tethered tri-peptide to the interacting surfaces could effectively
defeat inter-molecular aggregation. As the complementary fragments fold cooperatively
into the active complex, however, the weakly bound tri-peptide would be readily stripped
from its binding site by steric strain as the two become separated in the emerging native
conformation. In this way the general ability of tethered small peptides to stabilize larger
proteins without interfering with protein folding may be understood.

When the same random tri-peptide libraries were screened for fos/jun-mediated
ampicillin resistance in the TG1 strain, five clones were recovered on 400 µg/ml ampicillin.
With the fos-jun interaction alone TG1 cells will not plate above 50 µg/ml ampicillin.
Thus, as before, tri-peptides were selected which substantially increased the level of
ampicillin resistance produced by the fos-jun interaction alone. This time four different α
tri-peptides were recovered, each with a different ω tri-peptide.

Pairs                    α                    ω

FHT400-1A1, -1B1         HSE (cat agt gag)    REQ (cgg gag cag)
FHT400-2A1, -2B1         NGR (aat ggg cgg)    QGN (cag ggt aat)
FHT400-4A1, -4B1         GRE (ggt cgg gag)    DGR (gat ggg agg)
FHT400-9A1, -9B1         EKR (gag aag cgt)    GRR (ggt agg agg)
FHT400-10A2, -10B1       NGR (aat ggg cgg)    GNS (ggt aat agt)

GRE was selected again from the α tri-peptide library, and NGR was selected twice from the
α tri-peptide library, with two different ω tri-peptides. In all cases, activation continued to
be dependent on the fos-jun interaction. However, in contrast to the original GRE
tri-peptide, activity was enhanced in all cases by the presence of both the α and ω
tri-peptides. Even the activity of the GRE tri-peptide was enhanced by the DGR tri-peptide on
the ω fragment. Also, the fragments were interchangeable to some extent. Different α
tri-peptides could be paired with different ω tri-peptides. The fact that enhanced activity was
still fully dependent on the heterologous interaction suggests that the primary effect of the
peptides was protection of the fragments to which they were attached from aggregation,
rather than stabilization of the final fragment complex. The latter would be expected to
confer constitutive activity, independent of the heterologous interaction.

The GRE tri-peptide was also found to stabilize α197 in trans. When the α197-fos
and jun-ω198 fusions were co-expressed in the E. coli periplasm with the GRE tri-peptide
fused to the N-terminus of thioredoxin via a Gly4Ser linker, the cells plated with 100%
efficiency on 50 µg/ml ampicillin, whereas cells expressing the α197-fos and jun-ω198
fusions either alone, without the GRE-trxA fusion, or with a different tri-peptide-trxA
fusion, plated with only ~1% efficiency on 50 µg/ml ampicillin. The GRE-trxA fusion
conferred no resistance to ampicillin in the absence of the interacting helixes, thus it does
not stabilize the re-folded fragment complex, but rather it must stabilize the α197 fragment
since activity is limited by the amount of soluble α197. Since the GRE tri-peptide had the
same stabilizing effect on the α197 fragment when a different carrier was used, its activity
must be context independent. Thus, an 18 kDa enzyme fragment could be stabilized at least
100-fold by a tri-peptide selected from a random sequence library. As with the tethered
tri-peptide, the free GRE tri-peptide could inhibit aggregation of α197 without apparently
interfering with re-folding of the fragment complex. In this case, however, displacement
of the tri-peptide would have been greatly assisted by the fact that the effective
intra-molecular concentrations of structural elements relative to one another would have been
much higher than the tri-peptide concentration. In this way the general ability of small
peptides to stabilize large proteins in trans without interfering with protein folding may be
understood. This phenomenon is not widely appreciated, and in fact this may be the first
demonstration that a functional protein could be deliberately stabilized by something as
small as a tri-peptide.

Example 7 

Mutationally-Enhanced Fragment Complementation 

The ability of tri-peptides to stabilize β-lactamase fragments and thereby to increase
both the interaction-dependent activity and activation index of the TEM-1 α197/ω198
complex should be of great benefit for in vitro applications of β-lactamase fragment
complementation, where utility is most limited by fragment instability. Thus, it was of
interest to determine if a comparable stabilization of the α197 fragment could be achieved
by random mutagenesis and selection. To test this, the α197 coding sequence was
mutagenized by error-prone PCR (Cadwell and Joyce, 1995, supra). The PCR conditions
of Cadwell and Joyce mis-incorporate nucleotides in an unbiased fashion at a rate of one
mutation every ~150 nucleotides. Since the α197 coding sequence is actually about 520
nucleotides in length, and ~75% of mutations change the encoded amino acids, less than
three coding changes per molecule should be produced. About 10^ clones of the α197
mutant library were collected and co-expressed as the jun helix fusion with the fos helix
fusion of wild-type ω198. The mutagenized α197-jun fusion was expressed from the pAE1
vector and the fos-ω198 fusion was expressed from the pAO1 phagemid vector (see Figure
6). When both constructs were co-expressed in strain DH5α, colonies were recovered in
the presence of 600 µg/ml ampicillin. Upon sequencing, two of three clones recovered
(FI600-1 and -3) had the same sequence with two coding mutations, K55E (aag→gag) and
M182T (atg→acg). The third clone (FI600-4) also had two coding mutations, one of
which was shared with the other two (M182T), and the other of which, P62S (ccc→tcc),
was proximal to the other mutation of the other clones.

Cells expressing either mutant consistently plated at >30% efficiency on 100 µg/ml
ampicillin, whereas cells expressing the wild-type α197 plated at < 10^ colonies per cell
on 100 µg/ml ampicillin, and ~30% on 25 µg/ml ampicillin. However, for both mutants,
plating efficiencies were just as high or higher in the absence of the heterologous
interaction, i.e., with the jun helix removed. An exhaustive search for more mutations did
not turn up any mutants with interaction-dependent activity. Thus, in contrast to the results
obtained with random tri-peptides, where activation remained interaction-dependent,
adaptive mutations of α197 invariably eliminated interaction dependence. This may be
understood as follows. The tri-peptides stabilized the fragments by reversibly interfering
with aggregation. Reversibility allows them to inhibit aggregation without interfering with
folding. However, mutations are not reversible in this sense. If aggregation is caused
primarily by the inter-molecular formation of native folding contacts, disruption of these by
mutation might be expected to interfere with folding. In fact, it may be thermodynamically
impossible to stabilize the fragments by mutation without inhibiting the re-folding process
required to form the active fragment complex. This is because the native folds of the
fragments have too much exposed hydrophobic surface to be stable. Thus, mutations can
only stabilize the fragments by stabilizing alternative folds, which minimize exposed
hydrophobic surface. However, these alternative folds must be unfolded before the native
folding pathway can proceed to the active complex, and the energy required for this
process may be prohibitive.
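
The expected mutational load of the error-prone PCR library follows from the figures quoted for the Cadwell and Joyce conditions. The sketch below uses only those figures (about 520 nucleotides, one mutation per ~150 nucleotides, ~75% of mutations changing the encoded amino acid) and assumes mutations are spread uniformly and independently along the sequence. The function name is illustrative.

```python
def expected_coding_changes(length_nt: float, nt_per_mutation: float, coding_fraction: float):
    """Return (expected mutations, expected amino acid changes) per molecule.

    Assumes a uniform, independent mis-incorporation rate along the sequence.
    """
    mutations = length_nt / nt_per_mutation
    return mutations, mutations * coding_fraction


mutations, coding = expected_coding_changes(520, 150, 0.75)
print(f"{mutations:.1f} mutations, {coding:.1f} coding changes per molecule")
```

This gives roughly 3.5 mutations and 2.6 coding changes per molecule, consistent with the statement that fewer than three coding changes are expected and with the two coding mutations found in each sequenced clone.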

Since most aggregation is driven by aggregation-prone intermediates in the folding
pathway, the rate of aggregation is proportional to the lifetimes of such species. The effects
of the break-point disulfide described above indicated that the fragments are capable of
association and initiation of folding in the absence of the heterologous interaction, but that
the folding process is aborted when the fragments are not held together in some way, such
as by the heterologous interaction or by the formation of a disulfide at the break-point. In
the absence of either of these the probability that the fragments will dissociate before
folding is complete is proportional to the folding time, which in turn is proportional to the
lifetimes of the folding intermediates. Thus, if the most likely mechanism for mutational
inhibition of aggregation is to destabilize folding intermediates, this would also accelerate
folding and thereby reduce the probability that fragment dissociation would occur before
folding were complete. In this way it may be understood why mutations which stabilize
the folded complex are more likely to be selected than mutations which stabilize the
fragments, and why the former, but not the latter, would give rise to constitutive,
interaction-independent activity.

For the TEM-1 β-lactamase of E. coli, the type member of the Class A
penicillinases, fragments have been identified which can complement to form active
enzyme when and only when the "break-point" termini of the fragments are fused to
proteins or other molecules which interact with each other directly or preferably through a
second molecule. Furthermore, the subject invention presents new methods whereby
enzyme fragments capable of interaction-dependent complementation may be identified and
modified specifically to confer dependence of their activity on the interaction of
heterologous domains fused to the break-point termini. Ligand-activated or
interaction-activated β-lactamases can be activated in multiple locations, including the bacterial
periplasm, bacterial cytoplasm, eukaryotic cell cytoplasm, or in vitro. They are highly
active against a wide variety of substrates, including antibiotics, chromogens, and
fluorogens, as well as β-lactam pro-drugs, pro-antibiotics, and pro-nutrients, which can
thus be used for both positive and negative viability selection and color selection. The
utility of β-lactamase fragment complementation systems has been demonstrated for
monitoring interactions between and among cell-surface receptors, antibodies, and random
peptide libraries displayed on the surface of a natural protein.

Example 8 

Construction of a Human Peripheral Blood Lymphocyte Proteome Interaction Library

The large number of functional interactions among both membrane-bound and
secreted proteins of circulating immune cells includes many which are yet to be discovered.
For example, among the 150 or so CD antigens discovered so far, functions and ligands
remain unknown for a substantial fraction (Ager et al., in Immunology Today Immune
Receptor Supplement, 2nd Ed. (1997)). In addition, the highly combinatorial mechanisms by
which signalling specificity is generated imply that many signalling proteins participate in
multiple functional interactions, and that even the best known of these proteins may have
ligands and functions which remain to be discovered. Thus, the functional interactions of
the extra-cellular proteome of the circulating cells of the immune system represent a
potentially rich reservoir of pharmacological targets which are not readily accessible by
currently available interaction mapping technologies. This proteome presents a unique
opportunity to demonstrate the power of interaction-dependent β-lactamase fragment
complementation systems for interaction mapping in that, while many important
interactions remain to be discovered, many are already known by which the efficiency of
the system can be gauged.

As discussed above, the activation index is the most important parameter of the
interaction-dependent fragment complementation system for cleanly discriminating bona
fide interactions from large pools of non-interacting protein pairs. Thus, for this
application one would use the P174/N175 fragment pair of TEM-1 β-lactamase (α174 and
ω175) because with the break-point disulfide this pair has the largest activation index,
~10^. It also has a robust specific activity, but this could probably be improved even
further with some fragment-stabilizing tri-peptides, so one may first wish to insert the VRK
or NNK tri-peptide library into the expression vectors between the break-point cysteines
and the linkers (see Figure 6), and select for growth on 300-800 µg/ml ampicillin. So long
as the activation index is not compromised, higher specific activity conferred by
fragment-stabilizing tri-peptides should allow weaker bona fide interactions in the expressed
sequence libraries to confer selectable activity. In order to maximize the quality of the
expressed sequence library, one might wish to subject the full-length cDNA library first to
a normalization protocol to normalize the frequencies of rare and abundant sequences.
From this normalized cDNA one would then prepare random primed cDNA by PCR, and
size-select fragments > 200 base-pairs to enrich the library for sequences which encode
fragments which are at least the size of single protein domains. Finally the library could
be run through a fold-selection protocol to enrich for coding sequences which are expressed
in the correct reading frame and in register with autonomously-folding protein domains
(AFD).

Rough microsomes, which are derived from membranes of rough ER and are
therefore enriched in mRNA for secreted and membrane proteins, may be isolated from
unfractionated lymphocytes from pooled human blood by sedimentation velocity in sucrose
density gradients (Gaetani et al., Methods in Enzymology (1983) 96:3; Natzle et al., J Biol
Chem (1986) 261:5575; Kopczynski et al., Proc Natl Acad Sci (1998) 95:9973).
Messenger RNA may then be purified from the rough microsomes using a commercially
available kit (e.g., Poly(A) Select, Promega, Inc., Madison, WI). A randomly-primed
cDNA library is then made from the RNA template and cloned directionally. First-strand
cDNA is made with AMV reverse transcriptase (RT) and random hexamer primers
(Sambrook et al., 1989, pp. 8.11-8.21). The primers contain a unique 5' extension with
convenient restriction sites for ligation into the β-lactamase α and ω fusion expression
vectors. The template is destroyed by the RNAseH activity of AMV RT and the unused
primers are removed using a spun column. The second strand is then made with the
Klenow fragment of DNA polymerase I and random hexamer primers containing a different
unique 5' extension with a different restriction site for insertion into the expression vectors.
After removal of unused primers, the cDNA is PCR-amplified with primers corresponding
to only the unique sequence on each original primer (Dieffenbach and Dveksler, in PCR
Primer: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY,
1995), so that the majority of amplified fragments have the correct orientation for
expression in E. coli. The product is then normalized by exhaustive hybridization to a
limiting amount of human genomic DNA immobilized on magnetic beads (Kopczynski et
al., 1998, supra). Since coding sequences are naturally normalized in genomic DNA,
cDNA recovered from the genomic DNA hybrids should be normalized. After a final
amplification, the PCR product is size selected by centrifugal gel filtration on Sephacryl
S-400 spun columns for fragments > ~200 bp. The cDNA is then digested with appropriate
restriction enzymes and ligated into the interaction-dependent β-lactamase α174 and ω175
fusion expression vectors, which are essentially the same as those shown in Figure 6,
except for some modifications required for fold selection. The vectors and protocol for
fold selection and interaction mapping of the cDNA library are illustrated in Figure 9.

For convenient fold selection, both vectors for expression of the library as α and ω
fusions are compatible phagemids. In addition, a peptide epitope tag, such as the
well-known 12-mer derived from the c-myc oncogene (Hoogenboom et al., 1998, supra), is
encoded at the C-terminus of the cDNA, or expressed sequence (ES), library in the α-fusion
vector, and at the N-terminus of the ES library in the ω-fusion vector. When co-expressed
with an anti-tag scFv, such as the anti-myc 9E10 scFv (Hoogenboom et al., 1998, supra),
fused to the other β-lactamase fragment, each fusion library can be enriched for clones
which express autonomously folding domains (AFD) in the correct reading frame. The
principle of the selection is that only fragments which can fold into their native
conformations will be stable enough to support selectable levels of β-lactamase fragment
complementation driven by the tag-anti-tag interaction.

The normalized cDNA library-vector ligation products are transduced into E. coli 
strain TG-1 by high-voltage electroporation (Dower et al., Nucleic Acids Res (1988) 
16:6127), and plated onto the minimum ampicillin concentration on which non-interactors 
are known to plate with efficiencies of <10^, since at least a 100-fold excess of 
non-AFD-encoding fragments is expected in the libraries. For the α174/ω175 system, the 
recommended ampicillin concentration would be ~25 µg/ml. Since there is not likely to 
be more than 10^ secreted or membrane protein genes expressed in PBLs, and the 
frequencies of expressible AFDs may be in the range of 10^ per gene, one should collect at 
least 10^ clones of each library to insure representation of all expressible extra-cellular 
AFDs. 

Once the normalized ES libraries have been enriched for AFD-encoding clones, the 
libraries can be rescued as filamentous phage by high-multiplicity super-infection of at least 
10^ cells of each library with the helper phage M13K07 (Sambrook et al., 1989, pp. 4.17-
4.19). After overnight growth in suspension the library phage are recovered from the 
culture supernatant by precipitation with polyethylene glycol, and reconstituted in 
phosphate-buffered saline. The library phage stocks may be stored frozen in 15% glycerol. 
Fresh E. coli TG-1 cells may then be co-infected with a high multiplicity of each phage 
library and plated onto a concentration of ampicillin on which the activation index of the 
system is known to be maximal. For the α174/ω175 system, 100 µg/ml ampicillin is 
optimal, since the activation index is at least 10^ and the fos-jun interaction-mediated 
plating efficiency is at least 50%. At least 10^ transforming units of each fusion library 
phage should be used to infect at least 10^ log phase TG-1 cells to insure that most of the 
possible pair-wise combinations of 10^ clones of each AFD library are present in the 
doubly infected cell population before selection. After a one-hour adsorption at 10^ cells 
per ml, the cells are washed, resuspended in fresh medium, and incubated for another hour 
with gentle shaking to allow the phagemid genes to express. The cells are then 
concentrated and plated on 100 large petri dishes (150 mm dia.) containing solid LB 
medium containing 1 mM IPTG and 100 µg/ml ampicillin. A small aliquot is plated on 
chloramphenicol and kanamycin to determine the number of co-transformants. 
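
The requirement that most pair-wise combinations be present in the doubly infected 
population can be checked with a simple sampling estimate. The following is a minimal 
sketch, not part of the described protocol. It assumes that each doubly infected cell 
carries one randomly chosen clone from each library, so that the number of cells carrying 
any given pair is approximately Poisson distributed. The function names and the library 
sizes in the example are hypothetical. 

```python
import math


def fraction_of_pairs_sampled(n_alpha, n_omega, doubly_infected_cells):
    """Expected fraction of distinct clone pairs present at least once.

    Assumes each doubly infected cell holds one random clone from each
    library (Poisson sampling of the n_alpha * n_omega possible pairs).
    """
    possible_pairs = n_alpha * n_omega
    mean_cells_per_pair = doubly_infected_cells / possible_pairs
    return 1.0 - math.exp(-mean_cells_per_pair)


def cells_for_coverage(n_alpha, n_omega, target_fraction):
    """Doubly infected cells needed to reach the target expected coverage."""
    if not 0.0 < target_fraction < 1.0:
        raise ValueError("target_fraction must lie strictly between 0 and 1")
    return math.ceil(-math.log(1.0 - target_fraction) * n_alpha * n_omega)


if __name__ == "__main__":
    # Hypothetical library sizes, chosen only to illustrate the calculation.
    clones_per_library = 10_000
    cells = 3 * clones_per_library ** 2
    print(fraction_of_pairs_sampled(clones_per_library, clones_per_library, cells))
    print(cells_for_coverage(clones_per_library, clones_per_library, 0.95))
```

Under these assumptions, roughly three doubly infected cells per possible pair are needed 
for about 95% of the pairs to be expected in the population before selection. 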

Since ~10^ cells are being seeded onto each plate, it is possible that the interaction 
frequency might be high enough for the plates to overgrow. This would take at least 10^ 
clones per plate. In this case, all of the selected clones would have to be recovered by 
scraping and replated at lower densities. If a large number of clones is recovered, at least 
100 should be replated anyway to determine the background frequency due to ampicillin 
escapes. From those that breed true, each candidate interactor should be recovered and 
tested for interaction with an unselected partner. Selected pairs should be sequenced and 
BLAST-searched for homology to known genes (Altschul et al., J Mol Biol (1990) 
215:403; Altschul et al., Nucleic Acids Res (1997) 25:3389). A large number of 
interactions among secreted and membrane proteins of immune cells are already known, 
such as the B-cell co-activation antigen, CD40 and its T-cell ligand, CD40L, and the T-cell 
activation antigens B7.1 and B7.2 and their ligands CD28 and CTLA4. Labeled 
oligonucleotide hybridization probes may be prepared for these known interactions, and 
colony lifts of the entire interaction library may be probed to see what fraction of expected 
interactors are actually represented in the library. Interaction-partner sequences from 
positive clones may be recovered, and homology searched to determine if known or new 
interactors have been identified. Colonies expressing bona fide interactions may be grown 
up and stored indefinitely in 15% glycerol at -70°C, pending further characterization or 
use for e.g., drug screening. 

Example 9 

Construction of an Intra-Cellular Signal Transduction Biosensor. 

Interaction-dependent β-lactamase fragment complementation systems can be 
adapted for activation or inactivation by virtually any post-translational modification that 
occurs naturally in cells. As a result they may be deployed intra-cellularly as biosensors to 
monitor the activity of any process which is regulated by post-translational modification. A 
major class of such processes is phosphorylation-regulated signal transduction pathways. 
Phosphorylation-regulated intermediates are obligatory components of most processes by 
which cells respond to extra-cellular conditions or messenger molecules by altering gene 
expression. Cellular responses to extra-cellular signals may fall into three general 
categories: growth, survival, and differentiation. A ubiquitous component of neoplastic 
transformation is the deregulation of growth control signaling, often accompanied by the 
deregulation of survival signaling as well. This often occurs by over-expression of 
phosphorylation-regulated signal transducers, or by mutational disabling of 
phosphorylation-mediated regulation. Thus, most so-called oncogenes are 
phosphorylation-regulated growth signal transducers, which become over-expressed or mutated to 
constitutive activity in cancer cells. 

The Her-2/neu oncogene is a 185 kDa Type I transmembrane receptor tyrosine 
kinase, which is a member of the epidermal growth factor receptor (EGFR) family. This 
growth factor receptor is over-expressed in particularly aggressive adenocarcinomas of 
epithelial origin in a number of tissues, notably breast. When normally expressed, Her- 
2/neu hetero-dimerizes with other EGF-family receptors when they are ligated by growth 
factor. This leads to cross phosphorylation of multiple tyrosines on the cytoplasmic 
domains of the receptors. Phosphorylation of tyrosine 1068 (Tyr1068) on Her-2/neu leads 
via phospho-tyrosine-binding accessory proteins and guanine nucleotide exchange factors 
to activation of p21ras, and thence to activation of cell division via the MAP kinase cascade. 

When Her-2/neu is sufficiently over-expressed, the background level of 
ligand-independent EGFR hetero-dimerization rises to a level which is in turn sufficient to 
maintain constitutive mitogenic signaling, even in the absence of growth factor, leading to 
the characteristically uncontrolled growth of tumor cells. Thus, there is much interest in 
finding drugs which can block the activation of Her-2/neu, particularly in a manner which 
can prevent constitutive signaling in tumor cells without blocking EGF signaling in normal 
cells. 

A cell-based biosensor, which produces a readily detectable and quantifiable signal 
when Her-2/neu activation is blocked, would be particularly useful for high-throughput 
screening of chemical libraries for compounds with anti-breast tumor potential. Such a 
biosensor may be set up with a β-lactamase fragment complementation system as follows. 
The ω fragment could be fused via a flexible linker to the C-terminus of Her-2/neu, which is 
proximal to the Tyr1068 substrate of the receptor kinase. The α fragment could then be 
fused to a binding protein, such as a scFv or VL, which binds to the Tyr1068 region of the 
receptor only when Tyr1068 is unphosphorylated. Since Tyr1068 is mostly phosphorylated 
in Her-2/neu over-expressing cells, especially in the presence of EGF, β-lactamase 
activation would be minimal. However, in the presence of an inhibitor of Her-2/neu 
activation, the proportion of unphosphorylated Tyr1068 would rise, recruiting the 
α-Tyr1068 binder fusion to the receptor where α-ω complementation would increase 
β-lactamase activity in the cells. In the presence of a fluorogenic β-lactamase substrate, 
inhibitors of Her-2/neu activation could be readily identified by increasing fluorescence in 
a matter of minutes, since dephosphorylation of Tyr1068 occurs rapidly upon inhibition of 
the Her-2/neu kinase activity. 

For intra-cellular biosensors both maximum activity and the activation index will be 
important. However, for all five of the best TEM-1 fragment pairs the activation index is 
expected to depend almost entirely on the difference in the affinity of the binder for Tyr vs 
phospho-Tyr. Thus, the fragment pair with the highest activity, i.e., G253/K254 (α253 
and ω254), would be preferred, especially since for intra-cellular applications the 
break-point disulfide cannot be used. It may be possible to increase the intra-cellular activity of 
α253/ω254, if desired, by selecting one or two fragment stabilizing tri-peptides, as 
described above. 

The first step in developing the Her-2/neu inactivation biosensor would be to obtain 
a Tyr1068-binding protein. This could be accomplished by inserting the coding sequence 
for the substrate peptide, PVPEYINQS, into the active site of thioredoxin, between G33 
and P34, flanked by short flexible linkers such as PGSGG to minimize structural 
constraints on the peptide, which does not require a rigid structure for binding to its natural 
ligand, the Grb2 SH2 domain. This Tyr1068 trxpep can then be fused via a (Gly4Ser)3 
linker to the N-terminus of ω254, and co-expressed in E. coli TG-1 cells with a scFv 
library of at least 10^ clones, or a VL library of at least 10^ clones fused to the C-terminus 
of α253 via the (Gly4Ser)3 linker. Since the Tyr1068-binder is being selected for 
deployment in the mammalian cell cytoplasm, it might be prudent to perform the selections 
in the E. coli cytoplasm. For this purpose the vectors in Figure 6 could be used with the 
signal peptides removed. Then a chromogenic substrate such as nitrocefin (λmax = 485 
nm; ε = 17,420 M⁻¹ cm⁻¹; McManus-Munoz and Crowder, Biochemistry (1999) 38:1547) 
would be used to select Tyr1068-binders by color. By plating at least 10^-10^ 
transformants at moderate to high stringency, i.e., on decreasing concentrations of the 
substrate, it should be possible to identify binders with sub-micromolar affinities since Tyr 
is the most common amino acid in high-affinity protein-protein interfaces. Such affinities 
will be desirable for maximum discrimination between Tyr and phospho-Tyr. Selected 
Tyr1068-binders must be tested for inhibition by phosphorylation of the Tyr. This can 
easily be accomplished by expressing the vectors in isogenic cells which over-express a 
broad spectrum tyrosine kinase (TKX1 cells, Stratagene, Inc., La Jolla, CA). 

Once a suitable phosphate-sensitive Tyr1068-binder has been identified, the entire 
coding sequence for the α253-Tyr1068-binder fusion may be subcloned into a 
mammalian expression vector, such as the pCMV-Tag vectors (TKX1 cells, Stratagene, 
Inc., La Jolla, CA) for expression in mammalian cells from the cytomegalovirus promoter. 

The ω254 fragment must be expressed as a fusion to the C-terminus of the Her-2/neu 
cytoplasmic domain, which contains Tyr1068. The coding sequence of the 1210-residue 
EGF receptor (Genbank accession no. X00588; Ullrich et al., Nature (1984) 309:418) may 
be used as it is operationally identical to Her-2/neu, and its Tyr1068 will become 
phosphorylated under the same conditions of over-expression and/or growth factor ligation 
in tumor cells. When fused to the C-terminus of EGFR via the (Gly4Ser)3 linker, the 
35-residue ω254 β-lactamase fragment will be only 152 residues away from Tyr1068. Both 
the EGFR-ω254 fusion and the α253-Tyr1068-binder fusion may be expressed from the 
same vector from a dicistronic mRNA. This is accomplished by inserting an internal 
ribosome entry site (IRES; Martinez-Salas, Curr Opin Biotechnol (1999) 10:458) between 
the termination codon of the upstream cistron and the initiation codon of the downstream 
cistron. This will allow both proteins to be made simultaneously from the same mRNA. 
The vector may be introduced into the tumor cell line by cationic liposome-mediated 
transfection, using e.g., lipofectamine (Gibco-BRL, Gaithersburg, MD) according to the 
protocol in the product literature. Operation of the biosensor may be tested in transiently 
transfected cells, and if operational, stable transformants may then be isolated by selection 
for long term antibiotic resistance. Multiple freely diffusible chromogenic and fluorogenic 
substrates are available for continuous monitoring of β-lactamase activity. Operationally, 
the ω254 fragment will be anchored to the plasma membrane at the C-terminus of the 
cytoplasmic domain of the receptor near Tyr1068, and the α253 fragment will be free in 
the cytoplasm as the Tyr1068-binder fusion. ATP-analog tyrosine kinase inhibitors are 
available commercially and can be used as positive controls for inhibitor selection, and to 
determine the signal increment from fully-activated to fully-inhibited EGFR. 

Example 10 

A Fragment Complementation System for Neomycin Phosphotransferase. 

Enzyme fragment complementation systems may also be useful for selection for the 
simultaneous incorporation of multiple genetic elements into the same cell or organism. 
For example, the production of secretory IgA antibodies in plants requires the introduction 
of four different genes into the same plant. For practical reasons this requires the 
introduction of at least two and preferably three different DNA molecules. For the 
production of genetically stable transgenic plants, each DNA molecule must carry its own 
selectable marker. The use of multiple antibiotic selection systems on the same 
transformants is cumbersome and inefficient, as the overall false positive and false negative 
rates tend to scale as the product of the rates for the individual antibiotics. Thus, two- or 
three-piece fragment complementation systems for a single antibiotic offer a distinct 
advantage over multiple antibiotic selection. 

For a two-fragment system, dependence of activation on the interaction of 
heterologous domains is not necessary. However, for simultaneous selection of triple 
transgenics, complementation of the enzyme fragment pair must be dependent on a 
heterologous interaction mediated by a free ligand, analogous to the activation of 
β-lactamase by the tri-molecular interaction of α197-jun, scFv-ω198, and CD40-fos, as 
described above. For these applications, the most important parameter is the maximum 
activity of the reconstituted enzyme, which is a function of both the specific activity and 
the efficiency of complementation. The activation index is not relevant because each 
fragment alone will have essentially no detectable activity, providing a background of zero. 
Thus, to insure recovery of the most competent fragment pairs for intra-cellular activity, 
the fos and jun interactors should be used with tri-peptide libraries between the 
break-points and the (Gly4Ser)3 linkers. The tri-peptide libraries will provide stabilizers for each 
fragment so that the selection will be biased toward the fragments producing the highest 
specific activities. For two-trait selection applications, i.e., bi-molecular selections, where 
a heterologous interaction is not required, specific activity may be increased further by 
mutagenesis and selection for fold accelerating mutations. For three-trait selection 
applications, selected fragment pairs will have to be tested for dependence on the 
heterologous interaction. In this case, the activation index will be of some importance, but 
for in vitro applications a modest index of 1000 will be more than adequate for clean 
selections. 

Neomycin phosphotransferase II (NPTII; Genbank accession no. M77786) is a 
267-amino acid enzyme from E. coli which inactivates aminoglycoside antibiotics such as 
neomycin and kanamycin by phosphorylation from ATP. NPTII is widely used as a 
selectable marker for plant and animal cell transformation. Thus, fragment 
complementation systems for NPTII would be particularly useful for facile generation of 
multiple-trait plant and animal transgenics. The three-dimensional structure of NPTII is 
not known, and its homology to known structures is too low for reliable prediction. 
However, as described above, empirically-derived neural net algorithms are available 
which allow fairly accurate prediction of secondary structure and solvent exposure for any 
protein sequence. The best of these algorithms is the PredictProtein program of Rost and 
Sander (1993, 1994, supra). Application of this program to the protein sequence of NPTII 
produced the result shown in Figure 10. Ten regions of the sequence have been predicted 
to have little secondary structure and to be exposed to solvent, and therefore to be potential 
sites for productive fragmentation. Fragment pairs corresponding to breakage in the center 
of each of these ten regions, or at two equally-spaced sites in the longer regions, may be 
generated by PCR with appropriate primers, and subcloned into vectors like those 
illustrated in Figure 6 for expression as the fos and jun helix fusions with intervening 
linkers. The vectors would differ from those in Figure 6 in not encoding signal peptides, 
and the pAO1 vector would have ampicillin resistance instead of kanamycin resistance. 
Also, the vectors should contain VRK or NNK random tri-peptide-encoding sequences 
between the cloning sites for the enzyme fragments and the (Gly4Ser)3 linkers. 
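
The rule for placing break points in predicted exposed regions (one in the center of a 
region, or two equally spaced sites in a longer region) can be sketched as follows. This is 
only an illustration. The region coordinates are hypothetical and are not taken from 
Figure 10, and the length above which a region counts as "longer" is an assumed 
parameter that the text does not specify. A break point k is taken to mean that residue k 
is the last residue of the N-terminal fragment. 

```python
def break_points(regions, long_region_min_len=12):
    """Candidate break points for a list of exposed regions.

    regions: (start, end) pairs, 1-based and inclusive.
    long_region_min_len: assumed cutoff for placing two sites instead of one.
    """
    points = []
    for start, end in regions:
        length = end - start + 1
        if length >= long_region_min_len:
            # Two sites that divide the region into roughly equal thirds.
            points.append(start + length // 3 - 1)
            points.append(start + (2 * length) // 3 - 1)
        else:
            # A single site at the center of the region.
            points.append(start + length // 2 - 1)
    return points


def fragment_pair(protein_length, break_point):
    """Residue ranges of the N-terminal and C-terminal fragments."""
    if not 1 <= break_point < protein_length:
        raise ValueError("break point must fall inside the protein")
    return (1, break_point), (break_point + 1, protein_length)


if __name__ == "__main__":
    nptii_length = 267  # length stated in the text
    # Hypothetical exposed regions, for illustration only.
    example_regions = [(10, 15), (100, 117)]
    for k in break_points(example_regions):
        print(k, fragment_pair(nptii_length, k))
```

Each break point yields one fragment pair, so ten regions with some longer regions split 
twice would give somewhat more than ten candidate pairs to clone. 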

The PCR product for each fragment is restriction digested and ligated into the 
appropriate vector, α fragments into the pAE1-type vector and ω fragments into the 
pAO1-type vector. The ligation products are then introduced into TG-1 cells by high-voltage 
electroporation, and plated onto chloramphenicol or ampicillin. At least 10^ transformants 
should be collected for each fragment. Also, kanamycin sensitivity should be determined 
for each fragment library, both to prevent false positives and to determine the minimum 
quantitatively selective kanamycin concentration. This should be the concentration on 
which single fragment plating efficiencies are <10^, since the frequencies of the 
fragment-stabilizing peptides could be this low. Since ~10^ co-transformants will be needed for 
each fragment pair for complete coverage of the tri-peptide libraries, quantitative phage 
infection should be used to combine the two libraries for each fragment pair. This is 
accomplished by rescuing the ω-fragment libraries (in the pAO1-type phagemid vector) as 
phage using M13K07 helper phage as described above. For facile quantitative infection at 
least 10^ cells bearing each α fragment library should be inoculated with at least 10^ phage 
bearing the corresponding ω fragment library. After one to two hours in suspension culture 
with gentle shaking to allow phage adsorption, penetration, and initiation of gene 
expression, the cells of each fragment pair are centrifuged, washed, and plated onto ten 
150-mm dishes containing solid LB medium with the minimum quantitatively selective 
concentration of kanamycin. 
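
The size of the tri-peptide libraries follows from the degenerate codon schemes named 
above. The sketch below enumerates the codons specified by NNK and VRK using the 
standard IUPAC base codes and the standard genetic code, and counts the DNA-level 
diversity of a three-codon library. The function names are illustrative, and the count is 
of DNA sequences, not of clones needed for a given coverage. 

```python
from itertools import product

# IUPAC degenerate base codes used by the NNK and VRK schemes.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "N": "ACGT", "K": "GT", "V": "ACG", "R": "AG"}

# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(degenerate_codon):
    """All concrete codons specified by a degenerate codon such as 'NNK'."""
    choices = (IUPAC[base] for base in degenerate_codon)
    return ["".join(bases) for bases in product(*choices)]


def encoded_residues(degenerate_codon):
    """Sorted distinct residues ('*' marks a stop codon) that can be encoded."""
    return sorted({CODON_TABLE[codon] for codon in expand(degenerate_codon)})


def library_size(degenerate_codon, positions=3):
    """Number of distinct DNA sequences in a library of `positions` codons."""
    return len(expand(degenerate_codon)) ** positions


if __name__ == "__main__":
    for scheme in ("NNK", "VRK"):
        print(scheme, len(expand(scheme)),
              "".join(encoded_residues(scheme)), library_size(scheme))
```

NNK gives 32 codons covering all 20 amino acids plus one stop codon (TAG), so a 
tri-peptide library has 32,768 DNA sequences. VRK gives 12 codons covering nine 
hydrophilic residues with no stop codon, so a tri-peptide library has 1,728 DNA 
sequences. 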

After overnight growth at 37°C, all kanamycin-resistant colonies may be pooled 
and re-plated onto increasing concentrations of kanamycin to identify those 
tri-peptide/fragment pair combinations producing the highest levels of kanamycin resistance. 
As many of the most active clones as necessary should be tested for dependence of activity 
on the fos-jun interaction. This can most easily be accomplished by removing one of the 
helixes by restriction digestion at sites in the gene construct included for this purpose. The 
digestion products are then re-ligated, re-transformed into TG-1 cells, and replated on 
kanamycin. As explained above activation indexes of 1000 are more than adequate, so the 
most active pairs with indexes of at least 1000 would be optimal. For tri-molecular 
activation in the cytoplasm, two hetero-dimerizing helix pairs may conveniently be used, 
such as the parallel-binding helixes from fos and jun as described above, and the 
anti-parallel-binding helixes from yeast DNA topoisomerase II (TopII; Berger et al., Nature 
(1996) 379:225). One of each helix pair would be fused to an NPTII fragment, and the 
other two helixes would be fused to each other, so that the NPTII fragments would only 
come together when the 2-helix fusion was present to form the tri-molecular complex. For 
example, an α-TopIIN fusion and a fos-ω fusion could only be brought together and 
activated by a jun-TopIIC fusion. Genes encoding each of the three fusions could then be 
distributed among three different DNA constructs which also encode genes of interest. In 
this way eukaryotic cells could be transformed with a mixture of the three different 
constructs and selected for the simultaneous presence of all three genes in the same cell 
simply by selection for growth on a single antibiotic. 

Example 11 

Target-Activated Enzyme Prodrug Therapy. 

Antibody-directed enzyme prodrug therapy (ADEPT) is a promising anti-cancer 
chemotherapeutic strategy which takes advantage of the catalytic power of enzymes to 
amplify the cytotoxicity-targeting power of tumor-specific antibodies. Enzymes are 
concentrated at the tumor site when administered as conjugates of tumor-specific 
antibodies. After unbound conjugate has cleared from the circulation, prodrugs may be 
administered which are relatively non-toxic until activated by the tumor-bound enzyme, 
whereupon the cytotoxic product may accumulate at the tumor site to concentrations which 
would be unattainable by parenteral administration of the drug without excessive toxicity. 
Enzymes such as β-lactamase have been chemically or genetically conjugated to 
tumor-targeting antibodies and used with β-lactam derivatives of anti-tumor drugs such as 
cephalosporin mustards and anthracyclines to achieve promising anti-tumor effects in 
animals. The efficacy of ADEPT is limited, however, by the need for unbound conjugate to 
clear the circulation before the prodrug can be administered. By the time the circulating 
conjugate is depleted to the threshold below which systemic activation of the prodrug 
would produce acceptable levels of toxicity, so much of the conjugate has been lost from 
the tumor that efficacy is often seriously compromised. 

This problem may be overcome by using an interaction-dependent β-lactamase 
fragment complementation system with tumor targeting antibodies. When fused to 
single-chain antibody fragments (scFv) which recognize non-overlapping epitopes on tumor 
markers, the β-lactamase fragments can localize to the tumor and reconstitute sufficient 
β-lactamase activity on the tumor cell surface to produce high levels of tumor-localized 
cytotoxicity from β-lactam prodrugs. The great advantage of such a system is that prodrug 
activation cannot occur in the general circulation or anywhere the tumor marker is not 
encountered, so that the prodrug may be administered either simultaneously with high 
doses of the scFv-fragment fusions, or at the point of highest tumor load of the fragments, 
without regard for the circulating levels of the fragments which would be completely 
inactive. 

As an example, the construction and purification of fusions of interaction-dependent 
β-lactamase fragments with scFv which bind non-overlapping epitopes on the human breast 
tumor marker Her-2/neu is described. One may then determine the kinetics of 
reconstitution of β-lactamase activity on the surface of Her-2/neu-expressing SKOV3 
human ovarian cancer cells. Under conditions of optimum loading, killing of the cells may 
then be assessed for different cephalosporin prodrugs as a function of concentrations known 
to be limiting in vivo. The resulting Tumor-Activated Enzyme Prodrug Therapy (TAcEPT) 
system may then be tested for its ability to ablate SKOV3 and other Her-2/neu-expressing 
human tumors in severe combined immuno-deficient (scid) mice. Once the efficacy and 
safety of the system has been demonstrated in animal models, toxicity and efficacy trials 
may be initiated in human breast cancer subjects. 

The requirements for therapeutic use of β-lactamase fragment complementation 
systems are similar to those for in vitro use in general. The most important parameters are 
specific activity and fragment stability, while activation indexes above 1000 confer little 
additional efficacy. Thus, the α253/ω254 would be the recommended fragment pair for 
this application because it has the highest interaction-dependent specific activity, the 
fragments are moderately stable, and its activation index is more than adequate. However, 
the stability of the α253 fragment could probably be improved by a custom 
fragment-stabilizing tri-peptide. Thus, before setting up the tumor-activated system, one might first 
subclone a degenerate sequence encoding the VRK or NNK tri-peptide library into the 
α253 expression construct between the break-point cysteine and the linker (see pAE1 in 
Figure 6). α253-stabilizing tri-peptides would then be selected by plating at least 10^ 
library transformants on increasing ampicillin from 400 to 1000 µg/ml, since α253/ω254 
plates quantitatively on 400 µg/ml even without a stabilizing peptide, and wild-type TEM-1 
β-lactamase does not plate on more than 1000 µg/ml when expressed under these 
conditions. 



11a. Expression of TEM-1 β-lactamase H25-G253 (α253) and K254-W288 (ω254) 
fragments as fusions to scFv against non-overlapping epitopes on the Her-2/neu human 
breast tumor marker. 

The tumor activation mechanism for these fragments may employ two scFvs such as 
those described by Schier et al. (Gene (1996) 169:147), which were derived from a phage 
display library of a human non-immune repertoire (Marks et al., 1991) by panning against 
a recombinant fragment comprising the extra-cellular domain (ED) of Her-2/neu. These 
two scFv appear to recognize non-overlapping epitopes, since they do not compete for 
binding to the Her-2/neu ED by ELISA. The affinity of one of these scFv was improved to 
sub-nM Kd in vitro (Schier et al., 1996, supra), and similar improvements in the other 
could be made using the same methods (Balint and Larrick, Gene (1993) 137:109). The 
coding sequences for the scFv may be subcloned into the β-lactamase α and ω fusion 
production vectors, pβlacα and pβlacω, shown in Figure 11. These vectors are derived 
from pET26b (Novagen), and have convenient restriction sites for insertion of both scFv 
and β-lactamase fragment sequences. Each fusion protein is inducibly expressed (IPTG) 
from the strong phage T7 promoter under the control of the lac repressor. Each primary 
translation product contains a pelB signal peptide for secretion into the bacterial periplasm 
and a C-terminal His6 tag for one-step purification from osmotic shock extracts by 
immobilized metal ion affinity chromatography (IMAC, Janknecht et al., Proc Natl Acad 
Sci (1991) 88:8972). The yield of each fusion protein can be optimized primarily by 
manipulation of the inducer concentration and the growth temperature. 

Each scFv may be expressed as both α and ω fusions to determine which 
arrangement(s) (1) support the highest binding activity; (2) support the highest enzymatic 
activity; and (3) support the highest yields. Initially, expression may be optimized by the 
criterion of silver-stained PAGE. Then fusion proteins should be purified from osmotic 
shock extracts (Neu and Heppel, 1965, supra) by IMAC. The purified fusion proteins may 
be tested for binding to an immobilized recombinant fusion of the Her-2/neu extra-cellular 
domain (ED) to a stabilizing immunoglobulin domain (Ig) by ELISA using an anti-His6 tag 
antibody (Qiagen). The purified fusion proteins may then be tested for reconstitution of 
β-lactamase activity on immobilized rec-Her-2/neu ED-Ig using a chromogenic substrate, 
nitrocefin (λmax = 485 nm; ε = 17,420 M⁻¹ cm⁻¹; McManus-Munoz and Crowder, 1999, 
supra). Immobilized BSA may be used as the negative control. 
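
The extinction coefficient quoted for nitrocefin allows an absorbance reading to be 
converted to a hydrolysis rate through the Beer-Lambert relation. The following is a 
minimal sketch. It assumes that the whole change in absorbance at 485 nm is due to 
hydrolyzed nitrocefin with the quoted coefficient. The 1 cm default path length is also an 
assumption, since microtiter wells usually have a shorter path. The function name is 
illustrative. 

```python
# Extinction coefficient at 485 nm quoted in the text, in M^-1 cm^-1.
NITROCEFIN_EPSILON = 17420.0


def hydrolysis_rate_um_per_min(delta_a485_per_min, path_length_cm=1.0,
                               epsilon=NITROCEFIN_EPSILON):
    """Convert a rate of absorbance change into micromolar product per minute.

    Beer-Lambert: A = epsilon * concentration * path length, so
    d[product]/dt = (dA/dt) / (epsilon * path length).
    """
    if path_length_cm <= 0 or epsilon <= 0:
        raise ValueError("path length and epsilon must be positive")
    molar_per_min = delta_a485_per_min / (epsilon * path_length_cm)
    return molar_per_min * 1e6


if __name__ == "__main__":
    # Example reading: 0.1742 absorbance units per minute in a 1 cm cuvette.
    print(hydrolysis_rate_um_per_min(0.1742))
```

Subtracting the rate measured on immobilized BSA from the rate measured on 
immobilized antigen would give the antigen-dependent activity. 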

11b. Determination of the kinetics of specific β-lactamase activation by binding of 
β-lacα/ω-scFv fusions to immobilized recombinant antigen. 

One may determine β-lactamase activity quantitatively as a function of binding of 
the fusion proteins to the immobilized antigen. This rate may then be compared to that 
obtainable with intact β-lactamase fused to the same scFv as an indication of how much 
activity may be localized on a tumor compared to an established vehicle, i.e., an 
antibody-β-lactamase conjugate. 

First, conditions are established for saturating the antigen with one of the scFv-β-lac 
fragment fusion proteins. The wells of microtiter plates are coated with antigen, and 
exposed to increasing amounts of the first scFv-fragment fusion until the ELISA signal 
plateaus. At this level, i.e., saturating amounts of the first fusion protein, increasing 
amounts of the second fusion are added. After binding and washing, β-lactamase activity is 
determined spectrophotometrically after a 30-minute incubation with excess nitrocefin. If the 
assay is performed in triplicate, Vmax should be a more or less linear function of the 
concentration of the second fusion. As the amount of second fusion is increased, at some 
point Vmax should plateau. The amount of the second fusion bound can be determined by 
ELISA, and a relative specific activity (kcat) may be computed for the 
fragment-reconstituted β-lactamase. The Km may be estimated in solution with saturating antigen and 
saturating first fusion and limiting amounts of the second fusion. A range of nitrocefin 
concentrations is added and the initial rates of change of absorbance at 485 nm are measured 
as a function of second fusion concentration. The Km is then computed from standard 
regression analysis. 
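
The text does not say which regression is meant. As one possible illustration, the 
sketch below estimates Km and Vmax from initial rates with a Hanes-Woolf linearization 
of the Michaelis-Menten equation, in which [S]/v plotted against [S] has slope 1/Vmax 
and intercept Km/Vmax. The function names and the data are hypothetical. A weighted 
non-linear fit would generally be preferable for noisy measurements. 

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hanes_woolf(substrate_conc, initial_rates):
    """Estimate (Km, Vmax) from substrate concentrations and initial rates.

    Michaelis-Menten: v = Vmax * S / (Km + S), hence
    S / v = S / Vmax + Km / Vmax.
    """
    ratios = [s / v for s, v in zip(substrate_conc, initial_rates)]
    slope, intercept = linear_fit(substrate_conc, ratios)
    vmax = 1.0 / slope
    km = intercept * vmax
    return km, vmax


if __name__ == "__main__":
    # Hypothetical, noise-free data generated with Km = 50 and Vmax = 2.0
    # (arbitrary units), used only to show that the fit recovers them.
    true_km, true_vmax = 50.0, 2.0
    substrate = [10.0, 25.0, 50.0, 100.0, 200.0, 400.0]
    rates = [true_vmax * s / (true_km + s) for s in substrate]
    print(hanes_woolf(substrate, rates))
```

Dividing the fitted Vmax by the amount of second fusion bound, as determined by ELISA, 
would give the relative specific activity referred to above. 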

To compare with intact β-lactamase, a fusion of intact β-lactamase to the second 
scFv may be prepared. This is then added in increasing amounts to antigen-coated wells 
which had been saturated with the first fusion as had been done before. Again, Vmax should 
be a more or less linear function of the amount of intact β-lactamase fusion and should 
plateau at saturation. At each point, the amount of intact β-lactamase fusion bound, as 
determined by ELISA, should be comparable to the amount of the second fragment fusion 
bound, and the ratio of Vmax should reflect the ratio of specific activities of the intact and 
fragment-reconstituted β-lactamases. For comparison, the Km should be estimated as 
described above for the fragment-reconstituted enzyme. The TEM-1 α253/ω254 fragment 
complex is expected to have a maximum activity (kcat) near that of the intact enzyme. If the 
Km are also comparable, activities on a tumor up to 100-fold higher at the peak of prodrug 
activation than with the conventional antibody-β-lactamase fusion might be expected, which 
may have 1% or less of its peak activity left when the unbound fusion has cleared the 
circulation enough to allow prodrug administration. 

IIe. Determination of killing kinetics of Her-2/neu-expressing SKOV3 ovarian carcinoma
cells by scFv-mediated β-lacα/ω activation of cephalosporin prodrugs.

The arrangement(s) of scFv-β-lactamase fragment coupling which produce(s) the
highest specific β-lactamase activities on immobilized antigen may then be tested for
activation of β-lactamase activity in the presence of human tumor cells expressing the
Her-2/neu antigen. Cell killing may be assayed using any of the three cephalosporin prodrugs
shown in Figure 5. The fragment-reconstituted activity may again be compared with the
intact β-lactamase activity, this time with respect to tumor cell killing. Such results should
indicate the dose range which may be required to show a significant anti-tumor effect in
animals, which will be the next step in preclinical evaluation of the tumor-targeted
β-lactamase.

The SK-OV-3 line of human ovarian adenocarcinoma cells (ATCC) may be seeded 
in 6-well tissue culture plates at 3x10^ cells per well in Dulbecco's Minimum Essential 
Medium (DMEM) supplemented with 10% fetal calf serum (FCS), and allowed to grow to
confluency at 37°C in 10% CO2. The saturability of both Her-2/neu epitopes on the cells
may be determined with increasing amounts of intact β-lactamase fused to each scFv, as
determined spectrophotometrically after nitrocefin hydrolysis. The Vmax of the
fragment-reconstituted enzyme may then be determined on the cells with saturating
concentrations of both fusions and nitrocefin. It would be expected to conform to the
predicted activity based on the maximum intact β-lactamase activity and the ratio of Vmax
observed on the immobilized recombinant antigen. The sensitivity of the cells to any of the
three prodrugs shown in Figure 5 may be determined essentially as described by Marais et al.
(Cancer Research (1996) 56:4735) with and without the intact β-lactamase-scFv fusions and
the α/ω fragment-scFv fusions under saturating conditions. The prodrugs are dissolved in DMSO
and diluted into DMEM/FCS to a range of concentrations immediately prior to use. One ml 
is added to each well and the cells are incubated overnight. The cells are then washed, 
trypsinized, and viability is determined by dye exclusion. Aliquots are then seeded into 
fresh dishes. After four days of growth, cell viability is assessed by incorporation of
[³H]thymidine as determined by liquid scintillation counting of acid-insoluble material. The
results are expressed as percentage of untreated control cells. Again, the relative
cytotoxicities of the prodrugs with the β-lactamase fragment system may be compared to
those of the intact β-lactamase fusions, particularly at the lower prodrug concentrations
where second-order rate constants (kcat/Km) may be important, to give an indication of the
potential increase in efficacy of TAcEPT over conventional ADEPT in vivo.
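
Expressing the thymidine incorporation results as a percentage of untreated control can be
sketched as follows. The function name, the averaging of replicate wells, and the optional
background subtraction are assumptions for illustration; the specification states only
that results are expressed relative to untreated control cells.

```python
def percent_of_control(treated_cpm, control_cpm, background_cpm=0.0):
    """Mean treated counts as a percentage of mean untreated control counts.

    treated_cpm and control_cpm are replicate scintillation counts
    (counts per minute); background_cpm is an optional blank value
    subtracted from both means (an assumption, not from the source).
    """
    if not treated_cpm or not control_cpm:
        raise ValueError("need at least one treated and one control count")
    treated = sum(treated_cpm) / len(treated_cpm) - background_cpm
    control = sum(control_cpm) / len(control_cpm) - background_cpm
    if control <= 0:
        raise ValueError("control counts must exceed background")
    return 100.0 * treated / control
```

Computing this value at each prodrug concentration, for both the fragment system and the
intact fusion, would give the paired dose-response curves that the comparison calls for.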

All publications and patent applications mentioned in this specification are indicative
of the level of skill of those skilled in the art to which this invention pertains. All
publications and patent applications are herein incorporated by reference to the same extent
as if each individual publication or patent application was specifically and individually
indicated to be incorporated by reference.

The invention now having been fully described, it will be apparent to one of 
ordinary skill in the art that many changes and modifications can be made thereto without 
departing from the spirit or scope of the appended claims.



